Merge with Dev #1000
Conversation
Commit history (each commit's PR template included the affirmation "I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin"; it applies to every entry below):

- Resolve issue with .venv being broken when using docker compose with Cognee (co-authored by Boris Arzentar)
- … 1947 (#760): no description provided (co-authored by Boris and Igor Ilic)
- Add support for UV and for Poetry package management
- Switch typing from str to UUID for NetworkX node_id
- Add both sse and stdio support for Cognee MCP
- …83] (#782): Add log handling options for cognee exceptions
- Fix issue with failing versions gh actions
- No description provided (co-authored by Vasilije)
- No description provided
- No description provided (co-authored by Vasilije)
- No description provided (co-authored by Hande and Vasilije)
- No description provided (co-authored by Hande and Vasilije)
- Add support for the Memgraph graph database following the [graph database integration guide](https://docs.cognee.ai/contributing/adding-providers/graph-db/graph-database-integration): implemented `MemgraphAdapter` for interfacing with Memgraph; updated `get_graph_engine.py` to return MemgraphAdapter when appropriate; added a test script `test_memgraph.py`; created a dedicated test workflow `.github/workflows/test_memgraph.yml` (co-authored by Vasilije and Boris)
- refactor: Handle boto3 s3fs dependencies better
- No description provided
- Update LanceDB and rewrite data points to run async (co-authored by Boris and Boris Arzentar)
- No description provided
- No description provided
- Short demo, as discussed with @hande-k and Lazar, illustrating how to get the pagerank rankings from the knowledge graph given the nx engine; a POC and a first step towards solving #643 (co-authored by Boris, Hande, and Vasilije)
- Added tools to check current cognify and codify status
- Fixes url typo in general adapters
- No description provided
- No description provided (co-authored by hajdul88, lxobr, Igor Ilic, Hande, and Vasilije)
- No description provided (five consecutive commits)
- No description provided (co-authored by Igor Ilic)
- Resolve issue with ollama and gemini ci cd
- …s3fs (#978): Makes s3 pathway imports optional so cognee can run without s3fs
- Add test of MCP functionality and starting of MCP server, fix some MCP and LanceDB issues
- Replaces Owlready2 with RDFLib (co-authored by Igor Ilic)
- No description provided
- Remove cognee models from permissions migration
- Add how to use postgres with Cognee docker compose
- No description provided (co-authored by Igor Ilic)
- Simple Cognee endpoint testing (co-authored by Boris)
- No description provided
Walkthrough

This update introduces comprehensive backend access control and dataset-scoped database management across the Cognee platform. It implements dataset-specific ACLs, context-aware database configuration, and permission-aware API endpoints. The pipeline execution model is refactored to support streaming pipeline run info and background processing. New exception classes, authentication strategies, and utility modules are added, while several legacy or redundant files are removed or replaced.
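The dataset-specific ACL idea described above can be illustrated with a minimal in-memory sketch. All names and the data structure here are hypothetical stand-ins for illustration, not Cognee's actual schema:

```python
from uuid import UUID, uuid4

# Hypothetical in-memory ACL: maps (user_id, dataset_id) to a set of permissions.
# A sketch of dataset-scoped access control, not Cognee's real implementation.
acl: dict[tuple[UUID, UUID], set[str]] = {}

def grant(user_id: UUID, dataset_id: UUID, permission: str) -> None:
    """Record that a user holds a permission on one specific dataset."""
    acl.setdefault((user_id, dataset_id), set()).add(permission)

def has_permission(user_id: UUID, dataset_id: UUID, permission: str) -> bool:
    """Check a permission scoped to a single dataset (no global roles here)."""
    return permission in acl.get((user_id, dataset_id), set())

owner, dataset = uuid4(), uuid4()
grant(owner, dataset, "read")
grant(owner, dataset, "write")
print(has_permission(owner, dataset, "write"))   # True
print(has_permission(uuid4(), dataset, "read"))  # False: other users get nothing by default
```

The point of scoping the key on `(user, dataset)` rather than on the user alone is that a permission-aware endpoint can check each requested `dataset_id` independently, which matches the per-dataset search flow in the sequence diagram below.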
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant API
    participant Auth
    participant DB
    participant Pipeline
    participant Queue
    User->>API: POST /v1/datasets/ (with name)
    API->>DB: Check if dataset exists
    alt Exists
        DB-->>API: Return existing dataset
    else Not exists
        API->>DB: Create dataset, assign owner
        DB-->>API: Return new dataset
    end
    API-->>User: Dataset info
    User->>API: POST /v1/add/ (with dataset_id)
    API->>DB: Ingest data to dataset
    DB-->>API: Data added
    API-->>User: Confirmation
    User->>API: POST /v1/cognify/ (with dataset_ids, run_in_background)
    API->>Pipeline: Start pipeline run (per dataset)
    Pipeline->>Queue: Stream PipelineRunInfo events
    API-->>User: PipelineRunInfo (initial)
    User->>API: WS /v1/cognify/subscribe/{pipeline_run_id}
    API->>Auth: Authenticate user via JWT
    Auth-->>API: User info
    loop Until completed
        Queue-->>API: PipelineRunInfo event
        API-->>User: Send event (activity, graph data, etc.)
    end
    User->>API: POST /v1/search/ (with dataset_ids)
    API->>DB: Check user permissions for datasets
    API->>DB: Run search per dataset (context-aware)
    DB-->>API: Search results
    API-->>User: Aggregated results
```
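The streaming middle section of this flow, where a pipeline producer pushes `PipelineRunInfo` events through a queue that the subscription handler drains until completion, can be sketched with an `asyncio.Queue`. The event fields and status names below are assumptions for illustration, not the real model:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class PipelineRunInfo:
    # Minimal stand-in for the real PipelineRunInfo model (fields are assumed).
    pipeline_run_id: str
    status: str
    payload: dict

async def run_pipeline(queue: asyncio.Queue) -> None:
    # Producer: a pipeline run streams status events into the queue.
    for status in ("started", "processing", "completed"):
        await queue.put(PipelineRunInfo("run-1", status, {}))
    await queue.put(None)  # sentinel: the stream is finished

async def forward_events(queue: asyncio.Queue) -> list[str]:
    # Consumer: the subscription handler forwards each event until the sentinel.
    seen = []
    while (info := await queue.get()) is not None:
        seen.append(info.status)
    return seen

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(run_pipeline(queue))
    statuses = await forward_events(queue)
    await producer
    return statuses

print(asyncio.run(main()))  # ['started', 'processing', 'completed']
```

In the real system the consumer side would live in the WebSocket handler and serialize each event to the subscriber; the queue decouples background pipeline execution from the connection lifetime.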
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 17350889 | Triggered | Generic Password | 4eb71cc | cognee/tests/test_remote_kuzu.py | View secret |
| 17116131 | Triggered | Generic Password | 3b07f3c | examples/database_examples/neo4j_example.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn the best practices here.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation
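As a toy illustration of what pre-commit secret detection does, the sketch below flags likely hardcoded credentials with a single regular expression. Real scanners (GitGuardian's ggshield, detect-secrets, and similar tools) use far richer detectors; this pattern is illustrative only:

```python
import re

# Toy detector: an assignment of a quoted literal to a credential-looking name.
# Purely illustrative; production scanners use entropy checks and many patterns.
SECRET_PATTERN = re.compile(
    r"\w*(password|passwd|secret|api_key)\s*=\s*[\"'][^\"']+[\"']",
    re.IGNORECASE,
)

def find_secrets(source: str) -> list[str]:
    """Return every suspicious assignment found in the given source text."""
    return [m.group(0) for m in SECRET_PATTERN.finditer(source)]

sample = 'db_password = "hunter2"\ntimeout = 30\n'
print(find_secrets(sample))  # ['db_password = "hunter2"']
```

Wired into a pre-commit hook, a check like this runs over staged files and aborts the commit on any match, which is exactly the "catch it before it leaves your machine" step recommended above.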
```yaml
name: Test using different async databases in parallel in Cognee
runs-on: ubuntu-22.04
steps:
  - name: Check out repository
    uses: actions/checkout@v4

  - name: Cognee Setup
    uses: ./.github/actions/cognee_setup
    with:
      python-version: '3.11.x'

  - name: Install specific graph db dependency
    run: |
      poetry install -E kuzu

  - name: Run parallel databases test
    env:
      ENV: 'dev'
      LLM_MODEL: ${{ secrets.LLM_MODEL }}
      LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
      LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
      LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
      EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
      EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
      EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
      EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
    run: poetry run python ./cognee/tests/test_parallel_databases.py
```
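The concurrency pattern a parallel multi-database test exercises can be sketched with `asyncio.gather`. The check functions below are hypothetical stand-ins, not the contents of `test_parallel_databases.py`:

```python
import asyncio

# Hypothetical per-backend smoke checks; the real test drives Cognee's adapters.
async def check_backend(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for real async driver calls
    return f"{name}: ok"

async def run_checks() -> list[str]:
    # Run every backend check concurrently; gather preserves argument order
    # regardless of which coroutine finishes first.
    return await asyncio.gather(
        check_backend("kuzu", 0.02),
        check_backend("lancedb", 0.01),
        check_backend("sqlite", 0.0),
    )

print(asyncio.run(run_checks()))  # ['kuzu: ok', 'lancedb: ok', 'sqlite: ok']
```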
Check warning

Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Copilot Autofix (AI, 6 months ago)
To fix the issue, add a permissions block to the workflow file. This block should specify the least privileges required for the workflow to function correctly. Since the workflow does not perform any write operations on the repository, the contents: read permission is sufficient. This change should be applied at the root level of the workflow to cover all jobs unless specific jobs require additional permissions.
```diff
@@ -2,2 +2,5 @@
 
+permissions:
+  contents: read
+
 on:
```
```yaml
name: Run MCP Test
runs-on: ubuntu-22.04
steps:
  - name: Check out repository
    uses: actions/checkout@v4

  - name: Set up Python
    uses: actions/setup-python@v5
    with:
      python-version: ${{ inputs.python-version }}

  - name: Install UV
    shell: bash
    run: |
      python -m pip install --upgrade pip
      pip install uv

  # This will install all dependencies along with the Cognee version deployed on PyPI
  - name: Install dependencies
    shell: bash
    working-directory: cognee-mcp
    run: uv sync

  # NEW: swap in current local cognee branch version
  - name: Override with cognee branch checkout
    working-directory: cognee-mcp
    run: |
      # Remove the Cognee wheel that came from PyPI
      uv pip uninstall cognee
      # Install the freshly-checked-out Cognee branch
      uv pip install --force-reinstall -e ../

  - name: Run MCP test
    env:
      ENV: 'dev'
      LLM_MODEL: ${{ secrets.LLM_MODEL }}
      LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
      LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
      LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
      EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
      EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
      EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
      EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
    working-directory: cognee-mcp
    run: uv run --no-sync python ./src/test_client.py
```
Check warning

Code scanning / CodeQL: Workflow does not contain permissions (Medium), job `test`

Copilot Autofix (AI, 6 months ago)
To fix the issue, we need to add a permissions block to the workflow. This block should specify the least privileges required for the workflow to function correctly. Since the workflow primarily interacts with the repository's contents and does not perform write operations, we can set contents: read as the permission. This ensures the workflow has only read access to the repository's contents, reducing the risk of unintended modifications.
The permissions block should be added at the root level of the workflow to apply to all jobs. Alternatively, it can be added to the specific job (test-mcp) if different jobs require different permissions.
```diff
@@ -5,2 +5,5 @@
 
+permissions:
+  contents: read
+
 jobs:
```
Actionable comments posted: 81
🔭 Outside diff range comments (2)
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (1)
402-413: Fix undefined variable issue in zero-limit search.

The `collection_size` variable is only defined when `limit == 0`, but it's used unconditionally on line 413, which will cause a NameError.

```diff
 client = self.get_qdrant_client()
+search_limit = limit
 if limit == 0:
     collection_size = await client.count(collection_name=collection_name)
+    search_limit = collection_size.count
 results = await client.search(
     collection_name=collection_name,
     query_vector=models.NamedVector(
         name="text",
         vector=query_vector if query_vector is not None else (await self.embed_data([query_text]))[0],
     ),
-    limit=limit if limit > 0 else collection_size.count,
+    limit=search_limit,
     with_vectors=with_vector,
 )
```

cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)

115-122: Verify the query method call should be awaited.

The `has_node` method calls `self.query()` but doesn't await it, which will cause issues since `query` is an async method.

Apply this fix:

```diff
-        results = self.query(
+        results = await self.query(
```
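The flagged pattern is easy to demonstrate in isolation: calling an async method without `await` yields a coroutine object rather than the query results. A minimal, self-contained sketch with a toy `Adapter` (not the real Neo4j adapter):

```python
import asyncio
import inspect

class Adapter:
    async def query(self, q: str) -> list[str]:
        # Toy async method standing in for a real database query.
        return [f"row for {q}"]

async def main() -> None:
    adapter = Adapter()

    broken = adapter.query("MATCH (n) RETURN n")  # missing await: a coroutine, not results
    assert inspect.iscoroutine(broken)
    broken.close()  # avoid a "coroutine was never awaited" warning

    fixed = await adapter.query("MATCH (n) RETURN n")  # awaited: the actual result list
    assert fixed == ["row for MATCH (n) RETURN n"]

asyncio.run(main())
print("ok")
```

Any downstream code that tries to iterate or index the un-awaited coroutine fails at runtime, which is why the review asks for the `await`.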
🧹 Nitpick comments (91)
CONTRIBUTING.md (1)
1-3: Use GitHub-compatible admonition syntax

The `[!IMPORTANT]` tag is an Azure DevOps/Markdown extension and won't render on GitHub. To ensure it displays correctly, merge the two lines into a single blockquote with a bold prefix.

```diff
-> [!IMPORTANT]
-> **Note for contributors:** When branching out, create a new branch from the `dev` branch.
+> **Important:** When branching out, create a new branch from the `dev` branch.
```

cognee/modules/retrieval/utils/description_to_codepart_search.py (1)
2-2: Logging import placement and usage
The `setup_logging` and `ERROR` imports are only used in the `__main__` block. Consider moving these imports inside that block or removing them if the script entrypoint is rarely executed.

cognee/modules/data/exceptions/__init__.py (1)
10-11: Re-exported exceptions trigger unused-import warnings

These imports are intended to expose the new exception classes but raise F401. Add an explicit `__all__ = ["UnstructuredLibraryImportError", "UnauthorizedDataAccessError", "DatasetNotFoundError", "DatasetTypeError"]` or append `# noqa: F401` to suppress the lint errors.

cognee/modules/pipelines/methods/__init__.py (1)
1-1: Re-export pattern missing `__all__`

Importing `get_pipeline_run` here is for the package-level API but triggers unused-import. Define `__all__ = ["get_pipeline_run"]` or add `# noqa: F401` to clarify its re-export purpose.

cognee/modules/users/authentication/default/__init__.py (1)
1-2: Clarify public API exports
Re-exporting `default_transport` and `DefaultJWTStrategy` without an `__all__` list causes unused-import warnings. Add `__all__ = ["default_transport", "DefaultJWTStrategy"]` or use `# noqa: F401` to document intended exports.

cognee/modules/users/models/__init__.py (1)
4-4: Exported model import unused by module
The `DatasetDatabase` import is for namespacing but raises F401. Please add it to an `__all__` list or append `# noqa: F401` to indicate this is an intentional re-export.

cognee-frontend/src/ui/Partials/SettingsModal/SettingsModal.tsx (1)
1-10: Consider removing commented code entirely.

The entire component has been commented out, which creates code clutter and maintenance overhead. If this component is permanently deprecated, consider removing it entirely rather than keeping commented code.

If this is temporary, consider adding a TODO comment explaining when/why it will be restored.

```diff
-// import { Modal } from 'ohmy-ui';
-// import Settings from './Settings';
-
-// export default function SettingsModal({ isOpen = false, onClose = () => {} }) {
-//   return (
-//     <Modal isOpen={isOpen} onClose={onClose}>
-//       <Settings onDone={onClose} />
-//     </Modal>
-//   )
-// }
+// TODO: SettingsModal temporarily disabled - remove if permanently deprecated
```

cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1)
6-6: Consider maintaining themability with CSS variables.

The change from `var(--global-color-primary)` to hardcoded `white` reduces the component's themability. If the loading indicator needs to be white specifically, consider using a CSS variable like `var(--loading-indicator-color, white)` to maintain flexibility.

```diff
-  border: 0.18rem solid white;
+  border: 0.18rem solid var(--loading-indicator-color, white);
```

cognee-frontend/src/app/layout.tsx (1)
9-9: Consider revising description for production readiness.

The description "Cognee Dev Mexican Standoff" appears to be a development placeholder. Consider updating to a more professional description that accurately describes the application's purpose before production deployment.

```diff
-  description: "Cognee Dev Mexican Standoff",
+  description: "Cognee - AI-powered knowledge management platform",
```

cognee-frontend/src/modules/datasets/getDatasetGraph.ts (1)
3-6: Consider adding error handling for robustness.

The function implementation is clean and straightforward. However, consider adding error handling for network failures and invalid JSON responses to improve user experience.

Consider this enhanced version with error handling:

```diff
 export default function getDatasetGraph(dataset: { id: string }) {
   return fetch(`/v1/datasets/${dataset.id}/graph`)
-    .then((response) => response.json());
+    .then((response) => {
+      if (!response.ok) {
+        throw new Error(`Failed to fetch dataset graph: ${response.status}`);
+      }
+      return response.json();
+    })
+    .catch((error) => {
+      console.error('Error fetching dataset graph:', error);
+      throw error;
+    });
 }
```

cognee-frontend/src/modules/datasets/createDataset.ts (1)
3-12: Add error handling for create operations.

The function implementation follows a consistent pattern with other API functions. However, adding error handling would improve robustness, especially for create operations where users need clear feedback on success or failure.

Consider this enhanced version with error handling:

```diff
 export default function createDataset(dataset: { name: string }) {
   return fetch(`/v1/datasets/`, {
     method: "POST",
     body: JSON.stringify(dataset),
     headers: {
       "Content-Type": "application/json",
     }
   })
-    .then((response) => response.json());
+    .then((response) => {
+      if (!response.ok) {
+        throw new Error(`Failed to create dataset: ${response.status}`);
+      }
+      return response.json();
+    })
+    .catch((error) => {
+      console.error('Error creating dataset:', error);
+      throw error;
+    });
 }
```

cognee/modules/users/methods/get_user.py (1)
4-4: Remove unused import.

The `sqlalchemy.exc` import is unused and should be removed.

```diff
-import sqlalchemy.exc
```

cognee/api/v1/delete/exceptions.py (1)
29-39: Fix the docstring description.

The exception class implementation is correct, but the docstring has an incorrect description.

```diff
-    """Raised when a dataset cannot be found."""
+    """Raised when data cannot be found."""
```

cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1)
12-12: Accessibility: Add proper alt text and consider responsive image handling

The logo alt text could be more descriptive for screen readers, and the fixed dimensions might not be responsive.

```diff
-  <Image src="/images/cognee-logo-with-text.png" alt="Cognee logo" width={176} height={46} className="h-12 w-44 self-center mb-16" />
+  <Image src="/images/cognee-logo-with-text.png" alt="Cognee application logo" width={176} height={46} className="h-12 w-44 self-center mb-16" priority />
```

cognee-frontend/src/modules/ingestion/useDatasets.ts (1)
15-16: Consider using a more specific TypeScript type instead of disabling the lint rule

The ESLint disable for the `any` type could be avoided by using a more specific type for the timeout.

```diff
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  const statusTimeout = useRef<any>(null);
+  const statusTimeout = useRef<NodeJS.Timeout | null>(null);
```

cognee/modules/users/methods/get_user_by_email.py (2)
16-16: Consider using `selectinload` instead of `joinedload` for better performance

For one-to-many relationships like `roles`, `selectinload` typically performs better than `joinedload` as it avoids cartesian products.

```diff
-    .options(joinedload(User.roles), joinedload(User.tenant))
+    .options(selectinload(User.roles), joinedload(User.tenant))
```

Add the import:

```diff
-from sqlalchemy.orm import joinedload
+from sqlalchemy.orm import joinedload, selectinload
```

9-21: Consider adding error handling for database connection issues

While the current implementation is clean, consider wrapping the database operations in a try-catch block for production robustness.

```diff
 async def get_user_by_email(user_email: str):
     db_engine = get_relational_engine()
+    try:
         async with db_engine.get_async_session() as session:
             user = (
                 await session.execute(
                     select(User)
                     .options(joinedload(User.roles), joinedload(User.tenant))
                     .where(User.email == user_email)
                 )
             ).scalar()
             return user
+    except Exception as e:
+        logger.error(f"Error retrieving user by email {user_email}: {e}")
+        raise
```

cognee-frontend/src/ui/elements/index.ts (1)
1-8: Consider adding JSDoc comments for better developer experience

Adding brief descriptions of each component could improve the developer experience when using these exports.

```diff
+/**
+ * UI Elements - Core component library
+ */
+
+/** Modal dialog component */
 export { default as Modal } from "./Modal";
+/** Form input component */
 export { default as Input } from "./Input";
+/** Dropdown select component */
 export { default as Select } from "./Select";
+/** Multi-line text input component */
 export { default as TextArea } from "./TextArea";
+/** Primary call-to-action button */
 export { default as CTAButton } from "./CTAButton";
+/** Secondary ghost button */
 export { default as GhostButton } from "./GhostButton";
+/** Neutral styled button */
 export { default as NeutralButton } from "./NeutralButton";
+/** Status indicator component */
 export { default as StatusIndicator } from "./StatusIndicator";
```

cognee-frontend/src/app/auth/token/route.ts (1)
4-4: Consider removing unused parameter instead of disabling eslint.

The `request` parameter is marked as unused with an eslint disable comment. If it's truly not needed, consider removing it from the function signature for cleaner code.

```diff
-// eslint-disable-next-line @typescript-eslint/no-unused-vars
-export async function GET(request: Request) {
+export async function GET() {
```

cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1)
20-22: Consider improving type safety.

The current implementation uses `keyof typeof NODE_COLORS` which provides some type safety, but you could enhance it further by defining a union type for valid node types.

```diff
+type NodeType = keyof typeof NODE_COLORS;
+
-export default function getColorForNodeType(type: string) {
-  return NODE_COLORS[type as keyof typeof NODE_COLORS] || colors.gray[500];
+export default function getColorForNodeType(type: string): string {
+  return NODE_COLORS[type as NodeType] || formatHex(colors.gray[500]);
 }
```

Note: Also consider using `formatHex()` for the default color to maintain consistency.

cognee-frontend/src/app/(graph)/GraphLegend.tsx (1)
11-11: Document the 100-node limit rationale.

The arbitrary limit of 100 nodes for legend generation isn't explained. Consider documenting why this limit exists or making it configurable.

Add a comment explaining the rationale:

```diff
+  // Limit to first 100 nodes to prevent performance issues with large datasets
   for (let i = 0; i < Math.min(data?.length || 0, 100); i++) {
```

cognee/modules/data/methods/get_dataset_ids.py (1)
`9-20`: Clean up the function documentation.

The docstring mentions a `pipeline_name` parameter that doesn't exist in the function signature. This should be removed for clarity.

Apply this diff to clean up the documentation:

```diff
 """
 Function returns dataset IDs necessary based on provided input.
 It transforms raw strings into real dataset_ids with keeping write permissions in mind.
 If a user wants to write to a dataset he is not the owner of it must be provided through UUID.

 Args:
     datasets:
-    pipeline_name:
     user:

 Returns:
     a list of write access dataset_ids if they exist
 """
```

cognee-frontend/src/ui/elements/Modal.tsx (1)
`6-12`: Consider adding accessibility and UX improvements.

The modal lacks essential accessibility features and user experience enhancements:

- ARIA attributes for screen readers
- Focus management and a focus trap
- Escape key to close
- Click outside to close
- Prevent body scrolling while the modal is open

Consider enhancing the modal with these features:

```diff
+import React, { useEffect, useRef } from 'react';
+
 interface ModalProps {
   isOpen: boolean;
   children: React.ReactNode;
+  onClose?: () => void;
 }

-export default function Modal({ isOpen, children }: ModalProps) {
+export default function Modal({ isOpen, children, onClose }: ModalProps) {
+  const modalRef = useRef<HTMLDivElement>(null);
+
+  useEffect(() => {
+    const handleEscape = (event: KeyboardEvent) => {
+      if (event.key === 'Escape' && onClose) {
+        onClose();
+      }
+    };
+
+    if (isOpen) {
+      document.addEventListener('keydown', handleEscape);
+      document.body.style.overflow = 'hidden';
+    }
+
+    return () => {
+      document.removeEventListener('keydown', handleEscape);
+      document.body.style.overflow = 'unset';
+    };
+  }, [isOpen, onClose]);
+
+  const handleBackdropClick = (event: React.MouseEvent) => {
+    if (event.target === modalRef.current && onClose) {
+      onClose();
+    }
+  };
+
   return isOpen && (
-    <div className="fixed top-0 left-0 right-0 bottom-0 backdrop-blur-lg z-50 flex items-center justify-center">
+    <div
+      ref={modalRef}
+      className="fixed top-0 left-0 right-0 bottom-0 backdrop-blur-lg z-50 flex items-center justify-center"
+      onClick={handleBackdropClick}
+      role="dialog"
+      aria-modal="true"
+    >
       {children}
     </div>
   );
 }
```

cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1)
`6-6`: Add documentation for the script's purpose.

Consider adding a docstring to explain what this script does and when to use it.

```diff
+"""
+Script to display statistics from a remote Kuzu graph database.
+Shows node count, edge count, and sample data for debugging/monitoring purposes.
+"""
+
 async def main():
```

cognee-frontend/src/ui/elements/StatusIndicator.tsx (1)
`2-7`: Extract the color mapping to constants for better maintainability.

Hardcoded color values make the component harder to maintain and test. Consider extracting them to constants.

```diff
+const STATUS_COLORS = {
+  DATASET_PROCESSING_STARTED: "#ffd500",
+  DATASET_PROCESSING_INITIATED: "#ffd500",
+  DATASET_PROCESSING_COMPLETED: "#53ff24",
+  DATASET_PROCESSING_ERRORED: "#ff5024",
+} as const;
+
 export default function StatusIndicator({ status }: { status: DatasetStatus }) {
-  const statusColor = {
-    DATASET_PROCESSING_STARTED: "#ffd500",
-    DATASET_PROCESSING_INITIATED: "#ffd500",
-    DATASET_PROCESSING_COMPLETED: "#53ff24",
-    DATASET_PROCESSING_ERRORED: "#ff5024",
-  };
```

cognee/modules/users/authentication/get_api_auth_backend.py (1)
`1-11`: Organize imports for better readability.

Consider grouping related imports together, separating third-party from local imports.

```diff
 import os
 from functools import lru_cache
+
 from fastapi_users import models
-
 from fastapi_users.authentication import (
     JWTStrategy,
     AuthenticationBackend,
 )

 from .api_bearer import api_bearer_transport, APIJWTStrategy
```

cognee/modules/users/authentication/get_client_auth_backend.py (2)
`18-18`: Move the import to module level.

Importing inside the inner function hides the dependency and is easy to miss when reading the module. Consider moving it to the top of the file.

```diff
+from .default.default_jwt_strategy import DefaultJWTStrategy
+
 @lru_cache
 def get_client_auth_backend():
     transport = default_transport

     def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
-        from .default.default_jwt_strategy import DefaultJWTStrategy
```

`13-30`: Consider refactoring to reduce code duplication.

The structure is very similar to `get_api_auth_backend.py`. Consider creating a shared helper function to reduce duplication.

```diff
+def _create_auth_backend(transport, strategy_class, lifetime_seconds):
+    def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
+        secret = os.getenv("FASTAPI_USERS_JWT_SECRET")
+        if not secret:
+            raise ValueError("FASTAPI_USERS_JWT_SECRET environment variable is required")
+        return strategy_class(secret, lifetime_seconds=lifetime_seconds)
+
+    return AuthenticationBackend(
+        name=transport.name,
+        transport=transport,
+        get_strategy=get_jwt_strategy,
+    )
+
 @lru_cache
 def get_client_auth_backend():
-    transport = default_transport
-
-    def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
-        secret = os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret")
-        return DefaultJWTStrategy(secret, lifetime_seconds=3600)
-
-    auth_backend = AuthenticationBackend(
-        name=transport.name,
-        transport=transport,
-        get_strategy=get_jwt_strategy,
-    )
-
-    return auth_backend
+    return _create_auth_backend(default_transport, DefaultJWTStrategy, 3600)
```

cognee-frontend/src/ui/elements/CTAButton.tsx (1)
`6-6`: Break up the long className string for readability.

The long className string makes the code harder to read and maintain. Consider breaking it into multiple lines or extracting it to a variable.

```diff
-  <button className={classNames("flex flex-row justify-center items-center gap-2 cursor-pointer rounded-3xl bg-indigo-600 px-4 py-3 text-white hover:bg-indigo-500 focus-visible:outline-2 focus-visible:outline-offset-2 focus-visible:outline-indigo-600", className)} {...props}>{children}</button>
+  <button
+    className={classNames(
+      "flex flex-row justify-center items-center gap-2 cursor-pointer rounded-3xl",
+      "bg-indigo-600 px-4 py-3 text-white hover:bg-indigo-500",
+      "focus-visible:outline-2 focus-visible:outline-offset-2 focus-visible:outline-indigo-600",
+      className
+    )}
+    {...props}
+  >
+    {children}
+  </button>
```

cognee-frontend/src/ui/Partials/FeedbackForm.tsx (1)
`51-67`: Consider adding client-side validation.

The form currently lacks client-side validation for the feedback field. Consider adding basic validation to improve the user experience.

```diff
 <div className="mb-4">
   <label className="block text-white" htmlFor="feedback">Feedback on agent's reasoning</label>
-  <TextArea id="feedback" name="feedback" type="text" placeholder="Your feedback" />
+  <TextArea
+    id="feedback"
+    name="feedback"
+    type="text"
+    placeholder="Your feedback"
+    required
+    minLength={10}
+  />
 </div>
```

cognee/eval_framework/evaluation/deep_eval_adapter.py (1)
`27-52`: Robust retry implementation, with minor suggestions for improvement.

The retry logic is well implemented, with exponential backoff and proper logging. Consider these enhancements:

- Specify exception types: catching all exceptions might mask programming errors.
- Cap the maximum delay: the current exponential backoff can reach 16-second sleeps on the final retry.

```diff
-    except Exception as e:
+    except (ConnectionError, TimeoutError, RuntimeError) as e:
```

Also consider adding a maximum delay cap:

```diff
-    time.sleep(2**attempt)  # Exponential backoff
+    time.sleep(min(2**attempt, 8))  # Cap at 8 seconds
```

cognee-frontend/src/app/auth/AuthPage.tsx (1)
`8-9`: Clean conversion to an async server component; consider error handling.

The conversion to an async server component is well executed. Consider adding error handling for the session retrieval:

```diff
 export default async function AuthPage() {
+  let session;
+  try {
-  const session = await auth0.getSession();
+    session = await auth0.getSession();
+  } catch (error) {
+    console.error('Failed to get session:', error);
+    session = null;
+  }
```

cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1)
`33-36`: Consider validating dataset_id before using it for file naming.

The function generates database file names directly from the `dataset_id` without validation. Consider adding a check to ensure the `dataset_id` doesn't contain filesystem-unsafe characters.

```diff
 dataset_id = await get_unique_dataset_id(dataset, user)
+
+# Validate dataset_id for filesystem safety
+if not dataset_id or any(char in str(dataset_id) for char in ['/', '\\', '..', '\0']):
+    raise ValueError(f"Invalid dataset_id for filesystem usage: {dataset_id}")

 vector_db_name = f"{dataset_id}.lance.db"
 graph_db_name = f"{dataset_id}.pkl"
```

cognee-frontend/src/app/(graph)/ActivityLog.tsx (1)
`46-46`: Fix inconsistent Tailwind CSS class usage.

The classes `flex-1/3` and `flex-2/3` are not standard Tailwind classes. Use proper flex basis classes.

```diff
-<span className="flex-1/3 text-xs text-gray-300 whitespace-nowrap mt-1.5">{formatter.format(activity.timestamp)}: </span>
-<span className="flex-2/3 text-white whitespace-normal">{activity.activity}</span>
+<span className="flex-none w-1/3 text-xs text-gray-300 whitespace-nowrap mt-1.5">{formatter.format(activity.timestamp)}: </span>
+<span className="flex-1 text-white whitespace-normal">{activity.activity}</span>
```

cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1)
`3-3`: Remove unused import.

The static analysis correctly identifies that `context_vector_db_config` is imported but never used.

```diff
-from cognee.context_global_variables import vector_db_config as context_vector_db_config
```

cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (1)
`638-644`: Consider adding parameter validation for node_id.

For consistency with other methods in the class, consider validating the `node_id` parameter to handle edge cases.

```diff
 async def get_node(self, node_id: str) -> Optional[Dict[str, Any]]:
     """Get a single node by ID."""
+    if not node_id:
+        return None
+
     query = """
     MATCH (node {id: $node_id})
     RETURN node
     """
     results = await self.query(query, {"node_id": node_id})
     return results[0]["node"] if results else None
```

cognee/modules/ingestion/classify.py (2)
`20-21`: Refactor: merge the isinstance calls for better readability.

The static analysis suggestion is valid; merge the isinstance calls for cleaner code.

```diff
-    if isinstance(data, BufferedReader) or isinstance(data, SpooledTemporaryFile):
+    if isinstance(data, (BufferedReader, SpooledTemporaryFile)):
```

`38-40`: Improve error message specificity.

The error message could be more helpful by distinguishing between unsupported data types and a missing s3fs installation.

```diff
-    raise IngestionError(
-        message=f"Type of data sent to classify(data: Union[str, BinaryIO) not supported or s3fs is not installed: {type(data)}"
-    )
+    if S3File is None:
+        raise IngestionError(
+            message=f"Unsupported data type {type(data)} or s3fs package not installed. Install s3fs for S3 file support."
+        )
+    else:
+        raise IngestionError(
+            message=f"Unsupported data type: {type(data)}. Supported types: str, BinaryIO, S3File"
+        )
```

cognee/modules/metrics/operations/get_pipeline_run_metrics.py (1)
`26-26`: Consider updating the function signature and name for clarity.

The function now processes a single `PipelineRunInfo` but still returns a list. Consider either:

- Returning a single `GraphMetrics` object (or None) instead of a list
- Renaming the function to `get_pipeline_run_metric` (singular)

This would better reflect the current behavior and reduce confusion.

```diff
-async def get_pipeline_run_metrics(pipeline_run: PipelineRunInfo, include_optional: bool):
+async def get_pipeline_run_metric(pipeline_run: PipelineRunInfo, include_optional: bool) -> GraphMetrics | None:
```

And update the return logic:

```diff
-    metrics_for_pipeline_runs = []
     # ... existing logic ...
     if existing_metrics:
-        metrics_for_pipeline_runs.append(existing_metrics)
+        return existing_metrics
     else:
         # ... create metrics ...
-        metrics_for_pipeline_runs.append(metrics)
-    return metrics_for_pipeline_runs
+        session.add(metrics)
+        await session.commit()
+        return metrics
```

cognee-frontend/src/ui/elements/Select.tsx (2)
`19-21`: Consider accessibility improvements for the dropdown icon.

The icon implementation looks good, but consider adding accessibility attributes for better screen reader support.

```diff
-  <span className="pointer-events-none absolute top-1/2 -mt-0.5 right-3 text-indigo-600 rotate-180">
+  <span
+    className="pointer-events-none absolute top-1/2 -mt-0.5 right-3 text-indigo-600 rotate-180"
+    aria-hidden="true"
+  >
```

`8-14`: Consider extracting base styles to a constant.

The Tailwind classes are quite long. Consider extracting them to improve maintainability.

```diff
+const DEFAULT_SELECT_CLASSES = "block w-full appearance-none rounded-md bg-white pl-4 pr-8 py-4 text-base text-gray-900 outline-1 -outline-offset-1 outline-gray-300 focus:outline-2 focus:-outline-offset-2 focus:outline-indigo-600";
+
 export default function Select({ children, className, ...props }: SelectHTMLAttributes<HTMLSelectElement>) {
   return (
     <div className="relative">
       <select
         className={
           classNames(
-            "block w-full appearance-none rounded-md bg-white pl-4 pr-8 py-4 text-base text-gray-900 outline-1 -outline-offset-1 outline-gray-300 focus:outline-2 focus:-outline-offset-2 focus:outline-indigo-600",
+            DEFAULT_SELECT_CLASSES,
             className,
           )
         }
```

cognee/modules/graph/methods/get_formatted_graph_data.py (3)
`13-30`: Consider refactoring the lambda to a named function for readability.

The lambda for node transformation is complex and would benefit from being extracted to a separate function for better readability and testability.

```diff
+def format_node(node):
+    """Format a graph node for frontend consumption."""
+    node_id, node_data = node
+    return {
+        "id": str(node_id),
+        "label": node_data["name"]
+        if ("name" in node_data and node_data["name"] != "")
+        else f"{node_data['type']}_{str(node_id)}",
+        "type": node_data["type"],
+        "properties": {
+            key: value
+            for key, value in node_data.items()
+            if key not in ["id", "type", "name", "created_at", "updated_at"]
+            and value is not None
+        },
+    }
+
 return {
-    "nodes": list(
-        map(
-            lambda node: {
-                "id": str(node[0]),
-                "label": node[1]["name"]
-                if ("name" in node[1] and node[1]["name"] != "")
-                else f"{node[1]['type']}_{str(node[0])}",
-                "type": node[1]["type"],
-                "properties": {
-                    key: value
-                    for key, value in node[1].items()
-                    if key not in ["id", "type", "name", "created_at", "updated_at"]
-                    and value is not None
-                },
-            },
-            nodes,
-        )
-    ),
+    "nodes": [format_node(node) for node in nodes],
```

`31-40`: Consider extracting an edge formatting function as well.

For consistency, consider extracting the edge formatting logic to a separate function.

```diff
+def format_edge(edge):
+    """Format a graph edge for frontend consumption."""
+    return {
+        "source": str(edge[0]),
+        "target": str(edge[1]),
+        "label": edge[2],
+    }
+
-    "edges": list(
-        map(
-            lambda edge: {
-                "source": str(edge[0]),
-                "target": str(edge[1]),
-                "label": edge[2],
-            },
-            edges,
-        )
-    ),
+    "edges": [format_edge(edge) for edge in edges],
```

`24-24`: Consider making excluded properties configurable.

The hardcoded list of excluded properties might need to be configurable for different use cases.

```diff
+EXCLUDED_PROPERTIES = {"id", "type", "name", "created_at", "updated_at"}
+
-    if key not in ["id", "type", "name", "created_at", "updated_at"]
+    if key not in EXCLUDED_PROPERTIES
```

cognee/modules/data/methods/load_or_create_datasets.py (2)
`22-24`: Consider the static analysis suggestion for comparison simplification.

The pylint suggestion to use `identifier in (ds.name, ds.id)` is valid but optional:

```diff
-    match = next(
-        (ds for ds in existing_datasets if ds.name == identifier or ds.id == identifier), None
-    )
+    match = next(
+        (ds for ds in existing_datasets if identifier in (ds.name, ds.id)), None
+    )
```

Both approaches are acceptable; the current one is slightly more explicit about what's being compared, and the performance difference is negligible for typical dataset counts.

`19-41`: Consider an optimization for lookup performance.

For better performance with large `existing_datasets` lists, consider using dictionary lookups instead of a linear search.

```diff
 async def load_or_create_datasets(
     dataset_names: List[Union[str, UUID]], existing_datasets: List[Dataset], user
 ) -> List[Dataset]:
     """
     Given a list of dataset identifiers (names or UUIDs), return Dataset instances:
     - If an identifier matches an existing Dataset (by name or id), reuse it.
     - Otherwise, create a new Dataset with a unique id.

     Note: Created dataset is not stored to database.
     """
+    # Create lookup dictionaries for O(1) access
+    datasets_by_name = {ds.name: ds for ds in existing_datasets}
+    datasets_by_id = {ds.id: ds for ds in existing_datasets}
+
     result: List[Dataset] = []

     for identifier in dataset_names:
-        # Try to find a matching dataset in the existing list
-        # If no matching dataset is found return None
-        match = next(
-            (ds for ds in existing_datasets if ds.name == identifier or ds.id == identifier), None
-        )
+        # Try to find a matching dataset using dictionary lookup
+        match = datasets_by_name.get(identifier) or datasets_by_id.get(identifier)

         if match:
             result.append(match)
             continue
```

cognee/modules/data/methods/get_authorized_existing_datasets.py (1)
`14-24`: Complete the function docstring and add parameter validation.

The docstring is missing the `permission_type` parameter description, and there's no validation of the permission type.

```diff
 async def get_authorized_existing_datasets(
     datasets: Union[list[str], list[UUID]], permission_type: str, user: User
 ) -> list[Dataset]:
     """
     Function returns a list of existing dataset objects user has access for based on datasets input.

     Args:
         datasets: List of dataset names or UUIDs to filter by
+        permission_type: Type of permission to check for (e.g., "read", "write")
         user: User object to check permissions for

     Returns:
         list of Dataset objects
     """
+    if not permission_type:
+        raise ValueError("permission_type cannot be empty")
```

cognee-frontend/src/modules/datasets/cognifyDataset.ts (1)
`34-57`: Remove the commented code or document its purpose.

The large block of commented WebSocket code should either be removed if not needed or properly documented with TODO comments explaining the planned implementation.

```diff
-  // const websocket = new WebSocket(`ws://localhost:8000/api/v1/cognify/subscribe/${data.pipeline_run_id}`);
-
-  // let isCognifyDone = false;
-
-  // websocket.onmessage = (event) => {
-  //   const data = JSON.parse(event.data);
-  //   onUpdate?.({
-  //     nodes: data.payload.nodes,
-  //     edges: data.payload.edges,
-  //   });
-
-  //   if (data.status === "PipelineRunCompleted") {
-  //     isCognifyDone = true;
-  //     websocket.close();
-  //   }
-  // };
-
-  // return new Promise(async (resolve) => {
-  //   while (!isCognifyDone) {
-  //     await new Promise(resolve => setTimeout(resolve, 1000));
-  //   }
-
-  //   resolve(true);
-  // });
+  // TODO: Implement WebSocket subscription for real-time updates
+  // This will replace the current fetch-then-get pattern with streaming updates
```

cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx (2)
`103-103`: Avoid the hardcoded dataset name.

The hardcoded "main_dataset" name should be configurable or generated dynamically.

```diff
-  createDataset({ name: "main_dataset" })
+  createDataset({ name: `dataset_${Date.now()}` })
```

Or better yet, allow the user to specify the dataset name through a UI input.

`31-158`: Consider breaking down component responsibilities.

This component handles too many concerns (dataset management, file uploads, modal state, search functionality). Consider extracting the following sub-components:

- `DatasetList` for rendering the dataset list
- `FileUploadButton` for file upload handling
- `SearchModal` for the search functionality

This would improve testability and code reusability while following the Single Responsibility Principle.
cognee-frontend/src/utils/fetch.ts (1)
`19-22`: Use a consistent fetch API and add a timeout.

Using `window.fetch` for token refresh and `global.fetch` for API calls is inconsistent and could cause confusion. Also consider adding timeout protection.

```diff
-  return window.fetch("/auth/token")
+  return global.fetch("/auth/token", { credentials: "include" })
     .then(() => {
       return fetch(url, options, retryCount + 1);
     });
```

cognee-frontend/src/modules/chat/hooks/useChat.ts (2)
`102-102`: Improve type safety by replacing `any` types.

Using `any` reduces type safety and makes the code harder to maintain. Consider creating proper interfaces for the system message types.

```diff
-// eslint-disable-next-line @typescript-eslint/no-explicit-any
-function convertToSearchTypeOutput(systemMessage: any[] | any, searchType: string): string {
+interface SystemMessage {
+  text?: string;
+  [key: string]: unknown;
+}
+
+function convertToSearchTypeOutput(systemMessage: SystemMessage[] | SystemMessage | string[], searchType: string): string {
```

`48-80`: Consider implementing request cancellation for concurrent operations.

The current implementation doesn't handle the case where a user sends multiple messages quickly. Previous requests should ideally be cancelled to prevent race conditions and outdated responses. You could use AbortController to cancel previous requests:

```diff
+  const [abortController, setAbortController] = useState<AbortController | null>(null);
+
   const handleMessageSending = useCallback((message: string, searchType: string) => {
+    // Cancel previous request if still pending
+    if (abortController) {
+      abortController.abort();
+    }
+
+    const newAbortController = new AbortController();
+    setAbortController(newAbortController);
+
     const sentMessageId = v4();
     // ... rest of the logic
-    return sendMessage(message, searchType)
+    return sendMessage(message, searchType, { signal: newAbortController.signal })
       .finally(() => {
+        setAbortController(null);
         enableSearchRun();
       });
```

cognee-frontend/src/app/auth/AuthForm.tsx (2)
`13-16`: Expand the error mapping for a better user experience.

Consider adding more comprehensive error handling to cover additional authentication scenarios.

```diff
 const errorsMap = {
   LOGIN_BAD_CREDENTIALS: "Invalid username or password",
   REGISTER_USER_ALREADY_EXISTS: "User already exists",
+  NETWORK_ERROR: "Unable to connect to server. Please check your internet connection.",
+  SERVER_ERROR: "Server error occurred. Please try again later.",
+  VALIDATION_ERROR: "Please check your input and try again.",
 };
```

`8-11`: Strengthen the TypeScript interface for better type safety.

The current interface extends HTMLFormElement but could be more specific about the expected form structure.

```diff
-interface AuthFormPayload extends HTMLFormElement {
-  email: HTMLInputElement;
-  password: HTMLInputElement;
-}
+interface AuthFormData {
+  email: string;
+  password: string;
+}
+
+interface AuthFormPayload extends HTMLFormElement {
+  readonly elements: HTMLFormControlsCollection & {
+    email: HTMLInputElement;
+    password: HTMLInputElement;
+  };
+}
```

cognee/modules/users/get_user_manager.py (1)
`23-36`: Remove the commented code or document why it's preserved.

Large blocks of commented code should either be removed or have clear documentation explaining why they're being kept. If this code is no longer needed, remove it entirely:

```diff
-    # async def get(self, id: models.ID) -> models.UP:
-    #     """
-    #     Get a user by id.
-
-    #     :param id: Id. of the user to retrieve.
-    #     :raises UserNotExists: The user does not exist.
-    #     :return: A user.
-    #     """
-    #     user = await get_user(id)
-
-    #     if user is None:
-    #         raise UserNotExists()
-
-    #     return user
```

If it needs to be preserved for future use, add a comment explaining why.
cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1)
`22-23`: Add error handling for missing queue removal.

The `remove_queue` function will raise a `KeyError` if the pipeline run ID doesn't exist in the dictionary.

```diff
 def remove_queue(pipeline_run_id: UUID):
-    pipeline_run_info_queues.pop(str(pipeline_run_id))
+    pipeline_run_info_queues.pop(str(pipeline_run_id), None)
```

cognee-frontend/src/ui/elements/TextArea.tsx (1)
`23-30`: Optimize callback dependencies to prevent unnecessary re-renders.

The `handleTextChange` callback includes `value` in its dependencies, which could cause unnecessary re-renders when the value changes frequently.

```diff
 const handleTextChange = useCallback((event: Event) => {
   const fakeTextAreaElement = event.target as HTMLDivElement;
   const newValue = fakeTextAreaElement.innerText;

-  if (newValue !== value) {
     onChange?.(newValue);
-  }
-}, [onChange, value]);
+}, [onChange]);
```

cognee-mcp/src/test_client.py (2)
`123-127`: Simplify nested context managers.

The nested `with` statements can be combined into a single statement for better readability.

```diff
-    async with stdio_client(server_params) as (read, write):
-        async with ClientSession(read, write) as session:
+    async with stdio_client(server_params) as (read, write), \
+            ClientSession(read, write) as session:
         # Initialize the session
         await session.initialize()
         yield session
```

`308-326`: Review the search test logic for a potential issue.

The search test breaks out of the loop when encountering the `NATURAL_LANGUAGE` or `CYPHER` search types, but `break` also skips any valid search types that come after them in the enum; `continue` is what's intended here.

```diff
 # Go through all Cognee search types
 for search_type in SearchType:
     # Don't test these search types
     if search_type in [SearchType.NATURAL_LANGUAGE, SearchType.CYPHER]:
-        break
+        continue
```

cognee-frontend/src/ui/Partials/SignInForm/SignInForm.tsx (1)
`51-71`: Consider improving form accessibility.

The form could benefit from additional accessibility features such as ARIA attributes and better error association.

```diff
-  <form onSubmit={signIn} className="flex flex-col gap-2">
+  <form onSubmit={signIn} className="flex flex-col gap-2" role="form" aria-label="Sign in form">
     <div className="flex flex-col gap-2">
       <div className="mb-4">
         <label className="block mb-2" htmlFor="email">Email</label>
-        <Input id="email" name="email" type="email" placeholder="Your email address" />
+        <Input
+          id="email"
+          name="email"
+          type="email"
+          placeholder="Your email address"
+          required
+          aria-describedby={signInError ? "signin-error" : undefined}
+        />
       </div>
       <div className="mb-4">
         <label className="block mb-2" htmlFor="password">Password</label>
-        <Input id="password" name="password" type="password" placeholder="Your password" />
+        <Input
+          id="password"
+          name="password"
+          type="password"
+          placeholder="Your password"
+          required
+          aria-describedby={signInError ? "signin-error" : undefined}
+        />
       </div>
     </div>
     <CTAButton type="submit">
       {submitButtonText}
       {isSigningIn && <LoadingIndicator />}
     </CTAButton>
     {signInError && (
-      <span className="text-s text-white">{signInError}</span>
+      <span id="signin-error" className="text-s text-white" role="alert" aria-live="polite">
+        {signInError}
+      </span>
     )}
   </form>
```

cognee/api/v1/search/search.py (2)
`24-25`: Merge the isinstance calls.

The two `isinstance` calls can be combined into a single call with a tuple of types for better readability.

```diff
-    if isinstance(datasets, UUID) or isinstance(datasets, str):
+    if isinstance(datasets, (UUID, str)):
         datasets = [datasets]
```

`12-22`: Consider reducing function parameters with a configuration object.

The function has 9 parameters, which exceeds recommended limits and makes it harder to maintain. Consider a search configuration object.

```diff
+from dataclasses import dataclass
+from typing import Union, Optional, List, Type
+
+@dataclass
+class SearchConfig:
+    query_text: str
+    query_type: SearchType = SearchType.GRAPH_COMPLETION
+    user: User = None
+    datasets: Optional[Union[list[str], str]] = None
+    dataset_ids: Optional[Union[list[UUID], UUID]] = None
+    system_prompt_path: str = "answer_simple_question.txt"
+    top_k: int = 10
+    node_type: Optional[Type] = None
+    node_name: Optional[List[str]] = None
+
-async def search(
-    query_text: str,
-    query_type: SearchType = SearchType.GRAPH_COMPLETION,
-    user: User = None,
-    datasets: Optional[Union[list[str], str]] = None,
-    dataset_ids: Optional[Union[list[UUID], UUID]] = None,
-    system_prompt_path: str = "answer_simple_question.txt",
-    top_k: int = 10,
-    node_type: Optional[Type] = None,
-    node_name: Optional[List[str]] = None,
-) -> list:
+async def search(config: SearchConfig) -> list:
+    # Extract parameters from config
+    query_text = config.query_text
+    query_type = config.query_type
+    user = config.user
+    datasets = config.datasets
+    dataset_ids = config.dataset_ids
+    # ... etc
```

cognee/modules/pipelines/models/PipelineRunInfo.py (2)
`16-18`: Use a class-level default for the status field.

The hardcoded status string works, and the `pass` statement is unnecessary when the class has a field definition:

```diff
 class PipelineRunStarted(PipelineRunInfo):
     status: str = "PipelineRunStarted"
-    pass
```

`21-33`: Apply the same pattern to the remaining subclasses.

The same improvement applies to all subclasses; remove the unnecessary `pass` statements since they have field definitions.

```diff
 class PipelineRunYield(PipelineRunInfo):
     status: str = "PipelineRunYield"
-    pass

 class PipelineRunCompleted(PipelineRunInfo):
     status: str = "PipelineRunCompleted"
-    pass

 class PipelineRunErrored(PipelineRunInfo):
     status: str = "PipelineRunErrored"
-    pass
```

Note: the pylint warnings about "too few public methods" are false positives; data models typically don't need multiple public methods.
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (1)
`156-159`: Simplify the conditional structure.

The `elif` after `return` is unnecessary and can be simplified.

```diff
 if self.qdrant_path is not None:
     return AsyncQdrantClient(path=self.qdrant_path, port=6333, https=is_prod)
-elif self.url is not None:
+if self.url is not None:
     return AsyncQdrantClient(url=self.url, api_key=self.api_key, port=6333, https=is_prod)
```

cognee-frontend/src/app/(graph)/GraphView.tsx (1)
`30-32`: Unused state variable.

The `isAddNodeFormOpen` value is extracted but never used elsewhere in the component, except being passed to `GraphControls`. If this state is only needed for passing to child components, the current implementation is fine; otherwise, consider removing the unused destructuring.
cognee-starter-kit/README.md (1)
`84-84`: Convert the bare URL to a proper markdown link.

```diff
-- create an account and API key from https://www.graphistry.com
+- create an account and API key from [Graphistry](https://www.graphistry.com)
```

cognee/context_global_variables.py (1)
`35-35`: Simplify the boolean condition check.

The current condition is unnecessarily convoluted and can be simplified for readability.

```diff
-    if not os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true":
+    if os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true":
```

cognee-frontend/src/app/(graph)/GraphVisualization.tsx (3)
`9-9`: Fix the typo in the interface name.

There's a typo in the interface name that should be corrected for consistency.

```diff
-interface GraphVisuzaliationProps {
+interface GraphVisualizationProps {
```

Also update the usage on line 20:

```diff
-export default function GraphVisualization({ ref, data, graphControls }: GraphVisuzaliationProps) {
+export default function GraphVisualization({ ref, data, graphControls }: GraphVisualizationProps) {
```

`23-56`: Consider removing or documenting the commented code.

There's a substantial amount of commented-out code for node-addition functionality. If this is planned future functionality, consider moving it to a separate branch or documenting its purpose. If it's no longer needed, remove it to improve readability.

`58-102`: Clean up commented code in the render function.

Similarly, the `renderNode` function contains significant commented-out code that hurts readability. Remove the unused code or document its future purpose.

cognee/modules/users/methods/get_authenticated_user.py (1)
8-48: Clean up commented authentication code.Since the migration to FastAPI Users is complete, consider removing the commented-out manual authentication code to improve code readability and reduce maintenance burden.
```diff
-# from types import SimpleNamespace
-
-# from ..get_fastapi_users import get_fastapi_users
-# from fastapi import HTTPException, Security
-# from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
-# import os
-# import jwt
-
-# from uuid import UUID
-
-# fastapi_users = get_fastapi_users()
-
-# # Allows Swagger to understand authorization type and allow single sign on for the Swagger docs to test backend
-# bearer_scheme = HTTPBearer(scheme_name="BearerAuth", description="Paste **Bearer <JWT>**")
-
-
-# async def get_authenticated_user(
-#     creds: HTTPAuthorizationCredentials = Security(bearer_scheme),
-# ) -> SimpleNamespace:
-#     """
-#     Extract and validate the JWT presented in the Authorization header.
-#     """
-#     if creds is None:  # header missing
-#         raise HTTPException(status_code=401, detail="Not authenticated")
-
-#     if creds.scheme.lower() != "bearer":  # shouldn't happen extra guard
-#         raise HTTPException(status_code=401, detail="Invalid authentication scheme")
-
-#     token = creds.credentials
-#     try:
-#         payload = jwt.decode(
-#             token, os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret"), algorithms=["HS256"]
-#         )
-
-#         auth_data = SimpleNamespace(id=UUID(payload["user_id"]))
-#         return auth_data
-
-#     except jwt.ExpiredSignatureError:
-#         raise HTTPException(status_code=401, detail="Token has expired")
-#     except jwt.InvalidTokenError:
-#         raise HTTPException(status_code=401, detail="Invalid token")
```

cognee-mcp/README.md (2)
52-82: Add language specifications to code blocks for better syntax highlighting. Multiple code blocks are missing language specifications, which affects readability and syntax highlighting.
Apply these fixes to improve code block formatting:
````diff
- ```
+ ```bash
 git clone https://github.com/topoteretes/cognee.git
- ```
+ ```
- ```
+ ```bash
 cd cognee/cognee-mcp
- ```
+ ```
- ```
+ ```bash
 pip install uv
- ```
+ ```
- ```
+ ```bash
 uv sync --dev --all-extras --reinstall
- ```
+ ```
- ```
+ ```bash
 source .venv/bin/activate
- ```
+ ```
- ```
+ ```env
 LLM_API_KEY="YOUR_OPENAI_API_KEY"
- ```
+ ```
- ```
+ ```bash
 python src/server.py
- ```
+ ```
- ```
+ ```bash
 python src/server.py --transport sse
- ```
+ ```
````
169-177: Add language specifications to remaining code blocks. The development section also has code blocks without language specifications.
Apply these fixes:
````diff
- ```
+ ```toml
 #"cognee[postgres,codegraph,gemini,huggingface,docs,neo4j] @ file:/Users/<username>/Desktop/cognee"
- ```
+ ```
- ```
+ ```bash
 uv sync --reinstall
- ```
+ ```
````

cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (1)
5-5: Remove unused imports. Static analysis correctly identified unused imports that should be removed for cleaner code.
Apply this fix:
```diff
-from typing import Dict, Any, List, Optional, Tuple
+from typing import List, Optional, Tuple
```

cognee/api/v1/add/routers/get_add_router.py (1)
39-43: Complete the TODO items for dataset integration. The TODO comments indicate incomplete dataset integration in the GitHub clone and URL fetch logic. These code paths don't utilize the provided dataset parameters, which could lead to data being added to the wrong dataset.
Would you like me to help implement the dataset integration for these code paths or create an issue to track this work?
Also applies to: 50-51
cognee-frontend/src/app/(graph)/CrewAITrigger.tsx (2)
13-16: Improve TypeScript type safety. The use of `any` types reduces type safety. Consider defining proper interfaces for the data structures.

```diff
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  onData: (data: any) => void;
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  onActivity: (activities: any) => void;
+  onData: (data: { nodes: NodeData[]; links: LinkData[] } | null) => void;
+  onActivity: (activities: Activity[]) => void;
```

Define the missing interfaces:
```typescript
interface NodeData {
  id: string;
  type: string;
  // other node properties
}

interface LinkData {
  source: string;
  target: string;
  // other link properties
}

interface Activity {
  id: string;
  timestamp: number;
  activity: string;
}
```
91-92: Ensure WebSocket cleanup in error scenarios. The WebSocket is closed in the `finally` block, but there's a potential race condition if the WebSocket `onmessage` handler closes it first. Consider adding a flag to prevent double closure.

```diff
+let websocketClosed = false;
 websocket.onmessage = (event) => {
   // ... existing code ...
   if (data.status === "PipelineRunCompleted") {
+    websocketClosed = true;
     websocket.close();
   }
 };

 // ... in finally block ...
 .finally(() => {
-  websocket.close();
+  if (!websocketClosed) {
+    websocket.close();
+  }
   setIsCrewAIRunning(false);
 });
```

cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2)
21-26: Consider making dataset configuration dynamic. The hardcoded `MAIN_DATASET` limits flexibility for multi-dataset scenarios mentioned in the PR objectives. Consider making this configurable via props or context.

```diff
-const MAIN_DATASET = {
-  id: "",
-  data: [],
-  status: "",
-  name: "main_dataset",
-};
+interface SearchViewProps {
+  dataset?: Dataset;
+}

-export default function SearchView() {
+export default function SearchView({ dataset = DEFAULT_DATASET }: SearchViewProps) {
   // ... use dataset prop instead of MAIN_DATASET
```
130-134: Consider improving accessibility for the select element. The native select element could benefit from better accessibility attributes and consistent styling with other form elements.
```diff
-<Select name="searchType" defaultValue={searchOptions[0].value} className="max-w-2xs">
+<Select
+  name="searchType"
+  defaultValue={searchOptions[0].value}
+  className="max-w-2xs"
+  aria-label="Search type selection"
+>
```

cognee-frontend/src/app/(graph)/GraphControls.tsx (2)
134-135: Potential DOM manipulation issue. Direct DOM manipulation using `getElementById` can be unreliable in React. Consider using a ref or controlled component pattern instead.

```diff
+const graphShapeSelectRef = useRef<HTMLSelectElement>(null);

 // In the timeout callback:
-const graphShapeSelectElement = document.getElementById("graph-shape-select") as HTMLSelectElement;
-graphShapeSelectElement.value = newValue;
+if (graphShapeSelectRef.current) {
+  graphShapeSelectRef.current.value = newValue;
+}

 // In the JSX:
-<Select defaultValue={DEFAULT_GRAPH_SHAPE} onChange={handleGraphShapeControl} id="graph-shape-select" className="flex-2/5">
+<Select ref={graphShapeSelectRef} defaultValue={DEFAULT_GRAPH_SHAPE} onChange={handleGraphShapeControl} className="flex-2/5">
```
138-138: Fix the timeout type casting. The type casting `as unknown as number` indicates a TypeScript issue. In browser environments, `setTimeout` returns a number, but Node.js returns a `NodeJS.Timeout`. Consider using a more robust approach.

```diff
-shapeChangeTimeout.current = setTimeout(() => {
+shapeChangeTimeout.current = window.setTimeout(() => {
   // ... callback code ...
-}, 5000) as unknown as number;
+}, 5000);
```

And update the ref type:
```diff
-const shapeChangeTimeout = useRef<number | null>();
+const shapeChangeTimeout = useRef<number | null>(null);
```

cognee/modules/pipelines/operations/pipeline.py (1)
76-77: Merge `isinstance` calls. The static analysis correctly identifies this simplification opportunity.
```diff
-if isinstance(datasets, str) or isinstance(datasets, UUID):
+if isinstance(datasets, (str, UUID)):
```

cognee/modules/search/methods/search.py (2)
43-57: Improve docstring completeness. The docstring is incomplete with placeholder text. Consider providing comprehensive documentation for this critical function.
```diff
+"""
+Search function with optional permission-aware filtering.
+
+Args:
+    query_text: The search query string
+    query_type: Type of search to perform (SearchType enum)
+    dataset_ids: Optional list of dataset UUIDs to search within
+    user: User performing the search
+    system_prompt_path: Path to system prompt file
+    top_k: Maximum number of results to return
+    node_type: Optional node type filter for graph searches
+    node_name: Optional node name filter for graph searches
+
+Returns:
+    List of search results, format depends on query_type
+
+Notes:
+    When ENABLE_BACKEND_ACCESS_CONTROL=true, searches are filtered by user permissions
+"""
```
59-62: Consider environment variable caching. Repeated environment variable lookups can be optimized by caching the value.
```diff
+# Cache at module level
+_ACCESS_CONTROL_ENABLED = os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true"

-if os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true":
+if _ACCESS_CONTROL_ENABLED:
```

cognee/api/v1/permissions/routers/get_permissions_router.py (1)
25-26: Optimize list comprehension. The static analysis correctly identifies an unnecessary comprehension.
```diff
-[dataset_id for dataset_id in dataset_ids],
+list(dataset_ids),
```

cognee/api/v1/cognify/cognify.py (1)
44-59: Consider removing unnecessary else clause. The `else` clause after `return` is unnecessary, as suggested by Pylint.
```diff
-    if run_in_background:
-        return await run_cognify_as_background_process(
-            tasks=tasks,
-            user=user,
-            datasets=datasets,
-            vector_db_config=vector_db_config,
-            graph_db_config=graph_db_config,
-        )
-    else:
-        return await run_cognify_blocking(
-            tasks=tasks,
-            user=user,
-            datasets=datasets,
-            vector_db_config=vector_db_config,
-            graph_db_config=graph_db_config,
-        )
+    if run_in_background:
+        return await run_cognify_as_background_process(
+            tasks=tasks,
+            user=user,
+            datasets=datasets,
+            vector_db_config=vector_db_config,
+            graph_db_config=graph_db_config,
+        )
+
+    return await run_cognify_blocking(
+        tasks=tasks,
+        user=user,
+        datasets=datasets,
+        vector_db_config=vector_db_config,
+        graph_db_config=graph_db_config,
+    )
```

cognee/api/v1/cognify/routers/get_cognify_router.py (1)
76-77: Consider simplifying nested context managers. As suggested by Ruff, multiple contexts can be combined into a single `with` statement.
```diff
-    async with db_engine.get_async_session() as session:
-        async with get_user_db_context(session) as user_db:
+    async with (
+        db_engine.get_async_session() as session,
+        get_user_db_context(session) as user_db,
+    ):
```

cognee/modules/ontology/rdf_xml/OntologyResolver.py (1)
126-202: Consider refactoring `get_subgraph` method to reduce complexity. The method works correctly but has high cyclomatic complexity. The BFS traversal logic and RDF triple pattern matching are sound, but the method could benefit from being broken down into smaller functions.
Consider extracting helper methods like:
- `_process_individual_relations(current, visited, queue, nodes_set, edges)`
- `_process_subclass_relations(current, visited, queue, nodes_set, edges)`
- `_process_object_properties(current, obj_props, visited, queue, nodes_set, edges, directed)`

This would improve readability and make the method easier to test and maintain.
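As an illustration of this decomposition (the helper shape and data structures below are hypothetical, not the actual `OntologyResolver` API), the BFS loop can stay small while per-node relation handling lives in its own function:

```python
from collections import deque


def _process_relations(current, relations, visited, queue, nodes_set, edges):
    """Hypothetical helper: record edges from `current` and enqueue unvisited neighbors.

    In the real resolver, one such helper would exist per relation family
    (individuals, subclasses, object properties).
    """
    for relation, neighbor in relations.get(current, []):
        edges.add((current, relation, neighbor))
        nodes_set.add(neighbor)
        if neighbor not in visited:
            visited.add(neighbor)
            queue.append(neighbor)


def get_subgraph(start, relations):
    """BFS traversal whose per-node work is delegated to helpers."""
    visited = {start}
    queue = deque([start])
    nodes_set = {start}
    edges = set()
    while queue:
        current = queue.popleft()
        # one call per relation family keeps this loop flat and testable
        _process_relations(current, relations, visited, queue, nodes_set, edges)
    return nodes_set, edges
```

Each helper can then be unit-tested in isolation with a small adjacency mapping, without constructing a full RDF graph.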
cognee/api/v1/datasets/routers/get_datasets_router.py (1)
153-170: Graph endpoint returns structured data correctly. The modified endpoint now returns structured `GraphDTO` data instead of string URLs, which is more useful for frontend consumption. However, consider improving error handling specificity.

Consider catching specific exceptions instead of the generic `Exception` and returning more specific error responses:

```diff
-    except Exception:
+    except (DatasetNotFoundError, DataNotFoundError) as e:
         return JSONResponse(
-            status_code=409,
-            content="Error retrieving dataset graph data.",
+            status_code=404,
+            content={"detail": str(e)},
+        )
+    except Exception as e:
+        logger.error(f"Unexpected error retrieving dataset graph: {str(e)}")
+        return JSONResponse(
+            status_code=500,
+            content={"detail": "Internal server error retrieving dataset graph."},
         )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (27)
- `.dlt/config.toml` is excluded by `!**/*.toml`
- `.github/workflows/backend_docker_build_test.yml` is excluded by `!**/*.yml`
- `.github/workflows/cd.yaml` is excluded by `!**/*.yaml`
- `.github/workflows/cd_prd.yaml` is excluded by `!**/*.yaml`
- `.github/workflows/e2e_tests.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_gemini.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_mcp.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_memgraph.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_suites.yml` is excluded by `!**/*.yml`
- `.github/workflows/vector_db_tests.yml` is excluded by `!**/*.yml`
- `cognee-frontend/package-lock.json` is excluded by `!**/package-lock.json`, `!**/*.json`
- `cognee-frontend/package.json` is excluded by `!**/*.json`
- `cognee-frontend/public/images/cognee-logo-with-text.png` is excluded by `!**/*.png`
- `cognee-frontend/public/images/crewai.png` is excluded by `!**/*.png`
- `cognee-frontend/public/images/deepnote.svg` is excluded by `!**/*.svg`
- `cognee-frontend/public/images/lancedb.svg` is excluded by `!**/*.svg`
- `cognee-frontend/public/images/neo4j.png` is excluded by `!**/*.png`
- `cognee-frontend/src/app/(graph)/example_data.json` is excluded by `!**/*.json`
- `cognee-mcp/pyproject.toml` is excluded by `!**/*.toml`
- `cognee-starter-kit/pyproject.toml` is excluded by `!**/*.toml`
- `cognee-starter-kit/src/data/companies.json` is excluded by `!**/*.json`
- `cognee-starter-kit/src/data/people.json` is excluded by `!**/*.json`
- `evals/comparative_eval/hotpot_50_corpus.json` is excluded by `!**/*.json`
- `evals/comparative_eval/hotpot_50_qa_pairs.json` is excluded by `!**/*.json`
- `poetry.lock` is excluded by `!**/*.lock`
- `pyproject.toml` is excluded by `!**/*.toml`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (207)
- .env.template (2 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (1 hunks)
- alembic/versions/ab7e313804ae_permission_system_rework.py (1 hunks)
- cognee-frontend/.prettierignore (1 hunks)
- cognee-frontend/.prettierrc (1 hunks)
- cognee-frontend/Dockerfile (2 hunks)
- cognee-frontend/eslint.config.mjs (1 hunks)
- cognee-frontend/postcss.config.mjs (1 hunks)
- cognee-frontend/src/app/(graph)/ActivityLog.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/CrewAITrigger.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphControls.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphLegend.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphView.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphVisualization.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1 hunks)
- cognee-frontend/src/app/auth/AuthForm.tsx (1 hunks)
- cognee-frontend/src/app/auth/AuthPage.module.css (0 hunks)
- cognee-frontend/src/app/auth/AuthPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/layout.tsx (1 hunks)
- cognee-frontend/src/app/auth/login/LoginPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/login/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/signup/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/token/route.ts (1 hunks)
- cognee-frontend/src/app/globals.css (1 hunks)
- cognee-frontend/src/app/layout.tsx (1 hunks)
- cognee-frontend/src/app/page.tsx (1 hunks)
- cognee-frontend/src/app/wizard/AddStep/AddStep.module.css (0 hunks)
- cognee-frontend/src/app/wizard/AddStep/AddStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/AddStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/CognifyStep/CognifyStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/CognifyStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/ConfigStep/ConfigStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/ConfigStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/ExploreStep/ExploreStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/ExploreStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/WizardPage.module.css (0 hunks)
- cognee-frontend/src/app/wizard/WizardPage.tsx (0 hunks)
- cognee-frontend/src/app/wizard/page.tsx (0 hunks)
- cognee-frontend/src/middleware.ts (1 hunks)
- cognee-frontend/src/modules/auth/auth0.ts (1 hunks)
- cognee-frontend/src/modules/chat/api/getHistory.ts (1 hunks)
- cognee-frontend/src/modules/chat/hooks/useChat.ts (1 hunks)
- cognee-frontend/src/modules/datasets/cognifyDataset.ts (1 hunks)
- cognee-frontend/src/modules/datasets/createDataset.ts (1 hunks)
- cognee-frontend/src/modules/datasets/getDatasetGraph.ts (1 hunks)
- cognee-frontend/src/modules/exploration/getExplorationGraphUrl.ts (1 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/index.ts (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/StatusIcon.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/index.ts (0 hunks)
- cognee-frontend/src/modules/ingestion/addData.ts (1 hunks)
- cognee-frontend/src/modules/ingestion/useDatasets.ts (3 hunks)
- cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1 hunks)
- cognee-frontend/src/ui/Icons/AddIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/CaretIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/DeleteIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/GitHubIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/SearchIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/index.ts (1 hunks)
- cognee-frontend/src/ui/Partials/Explorer/Explorer.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Explorer/Explorer.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/FeedbackForm.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/Footer/Footer.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Footer/Footer.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SearchView/SearchView.module.css (1 hunks)
- cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SettingsModal/Settings.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SettingsModal/SettingsModal.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SignInForm/SignInForm.tsx (3 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardHeading.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/index.ts (0 hunks)
- cognee-frontend/src/ui/Partials/index.ts (1 hunks)
- cognee-frontend/src/ui/elements/CTAButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/GhostButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Input.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Modal.tsx (1 hunks)
- cognee-frontend/src/ui/elements/NeutralButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Select.tsx (1 hunks)
- cognee-frontend/src/ui/elements/StatusIndicator.tsx (1 hunks)
- cognee-frontend/src/ui/elements/TextArea.tsx (1 hunks)
- cognee-frontend/src/ui/elements/index.ts (1 hunks)
- cognee-frontend/src/utils/fetch.ts (1 hunks)
- cognee-frontend/src/utils/handleServerErrors.ts (1 hunks)
- cognee-frontend/src/utils/index.ts (1 hunks)
- cognee-frontend/src/utils/useBoolean.ts (1 hunks)
- cognee-frontend/types/d3-force-3d.d.ts (1 hunks)
- cognee-mcp/README.md (2 hunks)
- cognee-mcp/src/server.py (6 hunks)
- cognee-mcp/src/test_client.py (1 hunks)
- cognee-starter-kit/.env.template (1 hunks)
- cognee-starter-kit/.gitignore (1 hunks)
- cognee-starter-kit/README.md (1 hunks)
- cognee-starter-kit/src/pipelines/custom-model.py (1 hunks)
- cognee-starter-kit/src/pipelines/default.py (1 hunks)
- cognee-starter-kit/src/pipelines/low_level.py (1 hunks)
- cognee/__init__.py (1 hunks)
- cognee/api/client.py (3 hunks)
- cognee/api/v1/add/add.py (2 hunks)
- cognee/api/v1/add/routers/get_add_router.py (4 hunks)
- cognee/api/v1/cognify/code_graph_pipeline.py (3 hunks)
- cognee/api/v1/cognify/cognify.py (3 hunks)
- cognee/api/v1/cognify/routers/get_cognify_router.py (1 hunks)
- cognee/api/v1/datasets/routers/get_datasets_router.py (5 hunks)
- cognee/api/v1/delete/exceptions.py (1 hunks)
- cognee/api/v1/delete/routers/get_delete_router.py (1 hunks)
- cognee/api/v1/permissions/routers/get_permissions_router.py (1 hunks)
- cognee/api/v1/search/routers/get_search_router.py (3 hunks)
- cognee/api/v1/search/search.py (1 hunks)
- cognee/api/v1/settings/routers/get_settings_router.py (1 hunks)
- cognee/api/v1/users/routers/get_auth_router.py (1 hunks)
- cognee/api/v1/users/routers/get_visualize_router.py (1 hunks)
- cognee/api/v1/visualize/visualize.py (2 hunks)
- cognee/context_global_variables.py (1 hunks)
- cognee/eval_framework/analysis/metrics_calculator.py (1 hunks)
- cognee/eval_framework/corpus_builder/corpus_builder_executor.py (1 hunks)
- cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py (2 hunks)
- cognee/eval_framework/evaluation/deep_eval_adapter.py (3 hunks)
- cognee/exceptions/exceptions.py (1 hunks)
- cognee/fetch_secret.py (0 hunks)
- cognee/infrastructure/databases/graph/config.py (1 hunks)
- cognee/infrastructure/databases/graph/get_graph_engine.py (3 hunks)
- cognee/infrastructure/databases/graph/kuzu/adapter.py (3 hunks)
- cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (1 hunks)
- cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1 hunks)
- cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (2 hunks)
- cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (20 hunks)
- cognee/infrastructure/databases/graph/neo4j_driver/neo4j_metrics_utils.py (2 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
- cognee/infrastructure/databases/utils/__init__.py (1 hunks)
- cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1 hunks)
- cognee/infrastructure/databases/vector/config.py (1 hunks)
- cognee/infrastructure/databases/vector/create_vector_engine.py (2 hunks)
- cognee/infrastructure/databases/vector/get_vector_engine.py (1 hunks)
- cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py (1 hunks)
- cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1 hunks)
- cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (4 hunks)
- cognee/modules/data/exceptions/__init__.py (1 hunks)
- cognee/modules/data/exceptions/exceptions.py (1 hunks)
- cognee/modules/data/methods/__init__.py (1 hunks)
- cognee/modules/data/methods/check_dataset_name.py (1 hunks)
- cognee/modules/data/methods/create_dataset.py (0 hunks)
- cognee/modules/data/methods/get_authorized_existing_datasets.py (1 hunks)
- cognee/modules/data/methods/get_dataset_ids.py (1 hunks)
- cognee/modules/data/methods/get_unique_dataset_id.py (1 hunks)
- cognee/modules/data/methods/load_or_create_datasets.py (1 hunks)
- cognee/modules/data/models/Data.py (0 hunks)
- cognee/modules/data/models/Dataset.py (1 hunks)
- cognee/modules/data/processing/document_types/open_data_file.py (1 hunks)
- cognee/modules/graph/methods/__init__.py (1 hunks)
- cognee/modules/graph/methods/get_formatted_graph_data.py (1 hunks)
- cognee/modules/graph/utils/expand_with_nodes_and_edges.py (4 hunks)
- cognee/modules/ingestion/classify.py (1 hunks)
- cognee/modules/ingestion/data_types/__init__.py (0 hunks)
- cognee/modules/metrics/operations/get_pipeline_run_metrics.py (2 hunks)
- cognee/modules/ontology/rdf_xml/OntologyResolver.py (3 hunks)
- cognee/modules/pipelines/methods/__init__.py (1 hunks)
- cognee/modules/pipelines/methods/get_pipeline_run.py (1 hunks)
- cognee/modules/pipelines/models/PipelineRunInfo.py (1 hunks)
- cognee/modules/pipelines/models/__init__.py (1 hunks)
- cognee/modules/pipelines/operations/log_pipeline_run_initiated.py (1 hunks)
- cognee/modules/pipelines/operations/log_pipeline_run_start.py (2 hunks)
- cognee/modules/pipelines/operations/pipeline.py (6 hunks)
- cognee/modules/pipelines/operations/run_tasks.py (2 hunks)
- cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1 hunks)
- cognee/modules/pipelines/utils/__init__.py (1 hunks)
- cognee/modules/pipelines/utils/generate_pipeline_id.py (1 hunks)
- cognee/modules/pipelines/utils/generate_pipeline_run_id.py (1 hunks)
- cognee/modules/retrieval/exceptions/exceptions.py (1 hunks)
- cognee/modules/retrieval/graph_completion_retriever.py (3 hunks)
- cognee/modules/retrieval/utils/brute_force_triplet_search.py (1 hunks)
- cognee/modules/retrieval/utils/description_to_codepart_search.py (2 hunks)
- cognee/modules/search/methods/search.py (4 hunks)
- cognee/modules/search/operations/get_history.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/__init__.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/api_bearer_transport.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py (1 hunks)
- cognee/modules/users/authentication/default/__init__.py (1 hunks)
- cognee/modules/users/authentication/default/default_jwt_strategy.py (1 hunks)
- cognee/modules/users/authentication/default/default_transport.py (1 hunks)
- cognee/modules/users/authentication/get_api_auth_backend.py (1 hunks)
- cognee/modules/users/authentication/get_auth_backend.py (0 hunks)
- cognee/modules/users/authentication/get_client_auth_backend.py (1 hunks)
- cognee/modules/users/exceptions/__init__.py (1 hunks)
- cognee/modules/users/exceptions/exceptions.py (1 hunks)
- cognee/modules/users/get_fastapi_users.py (1 hunks)
- cognee/modules/users/get_user_manager.py (2 hunks)
- cognee/modules/users/methods/__init__.py (1 hunks)
- cognee/modules/users/methods/get_authenticated_user.py (1 hunks)
- cognee/modules/users/methods/get_default_user.py (2 hunks)
- cognee/modules/users/methods/get_user.py (2 hunks)
- cognee/modules/users/methods/get_user_by_email.py (1 hunks)
- cognee/modules/users/models/ACL.py (1 hunks)
- cognee/modules/users/models/DatasetDatabase.py (1 hunks)
- cognee/modules/users/models/Tenant.py (1 hunks)
- cognee/modules/users/models/User.py (2 hunks)
- cognee/modules/users/models/__init__.py (1 hunks)
⛔ Files not processed due to max files limit (56)
- cognee/modules/users/permissions/__init__.py
- cognee/modules/users/permissions/methods/__init__.py
- cognee/modules/users/permissions/methods/authorized_give_permission_on_datasets.py
- cognee/modules/users/permissions/methods/check_permission_on_dataset.py
- cognee/modules/users/permissions/methods/get_all_user_permission_datasets.py
- cognee/modules/users/permissions/methods/get_document_ids_for_user.py
- cognee/modules/users/permissions/methods/get_principal.py
- cognee/modules/users/permissions/methods/get_principal_datasets.py
- cognee/modules/users/permissions/methods/get_role.py
- cognee/modules/users/permissions/methods/get_specific_user_permission_datasets.py
- cognee/modules/users/permissions/methods/get_tenant.py
- cognee/modules/users/permissions/methods/give_permission_on_dataset.py
- cognee/modules/users/permissions/methods/give_permission_on_document.py
- cognee/modules/users/permissions/permission_types.py
- cognee/modules/users/roles/methods/add_user_to_role.py
- cognee/modules/users/roles/methods/create_role.py
- cognee/modules/users/tenants/methods/__init__.py
- cognee/modules/users/tenants/methods/add_user_to_tenant.py
- cognee/modules/users/tenants/methods/create_tenant.py
- cognee/shared/logging_utils.py
- cognee/tasks/documents/__init__.py
- cognee/tasks/documents/check_permissions_on_dataset.py
- cognee/tasks/documents/detect_language.py
- cognee/tasks/documents/translate_text.py
- cognee/tasks/ingestion/ingest_data.py
- cognee/tasks/ingestion/resolve_data_directories.py
- cognee/tests/test_cognee_server_start.py
- cognee/tests/test_parallel_databases.py
- cognee/tests/test_pgvector.py
- cognee/tests/test_qdrant.py
- cognee/tests/test_remote_kuzu.py
- cognee/tests/test_remote_kuzu_stress.py
- cognee/tests/test_starter_pipelines.py
- cognee/tests/unit/modules/ontology/test_ontology_adapter.py
- cognee/tests/unit/modules/search/search_methods_test.py
- evals/comparative_eval/README.md
- evals/comparative_eval/helpers/calculate_aggregate_metrics.py
- evals/comparative_eval/helpers/convert_metrics.py
- evals/comparative_eval/helpers/modal_evaluate_answers.py
- evals/comparative_eval/qa_benchmark_base.py
- evals/comparative_eval/qa_benchmark_graphiti.py
- evals/comparative_eval/qa_benchmark_lightrag.py
- evals/comparative_eval/qa_benchmark_mem0.py
- examples/python/agentic_reasoning_procurement_example.py
- examples/python/code_graph_example.py
- examples/python/dynamic_steps_example.py
- examples/python/graphiti_example.py
- examples/python/multimedia_example.py
- examples/python/ontology_demo_example.py
- examples/python/ontology_demo_example_2.py
- examples/python/simple_example.py
- examples/python/simple_node_set_example.py
- examples/relational_db_with_dlt/fix_foreign_keys.sql
- examples/relational_db_with_dlt/relational_db_and_dlt.py
- notebooks/cognee_demo.ipynb
- notebooks/cognee_simple_demo.ipynb
💤 Files with no reviewable changes (34)
- cognee-frontend/src/app/wizard/AddStep/AddStep.module.css
- cognee-frontend/src/app/wizard/ExploreStep/index.ts
- cognee-frontend/src/modules/ingestion/DatasetsView/index.ts
- cognee-frontend/src/app/wizard/AddStep/index.ts
- cognee/modules/data/models/Data.py
- cognee/modules/ingestion/data_types/__init__.py
- cognee/modules/data/methods/create_dataset.py
- cognee-frontend/src/ui/Partials/Wizard/WizardHeading.tsx
- cognee-frontend/src/app/auth/AuthPage.module.css
- cognee-frontend/src/app/wizard/ConfigStep/index.ts
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.module.css
- cognee-frontend/src/ui/Partials/Wizard/index.ts
- cognee-frontend/src/app/wizard/ConfigStep/ConfigStep.tsx
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.module.css
- cognee-frontend/src/ui/Partials/Footer/Footer.module.css
- cognee-frontend/src/modules/ingestion/DatasetsView/StatusIcon.tsx
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.tsx
- cognee-frontend/src/ui/Partials/Explorer/Explorer.tsx
- cognee-frontend/src/app/wizard/page.tsx
- cognee-frontend/src/app/wizard/WizardPage.module.css
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx
- cognee-frontend/src/app/wizard/ExploreStep/ExploreStep.tsx
- cognee-frontend/src/app/wizard/CognifyStep/index.ts
- cognee-frontend/src/ui/Partials/Explorer/Explorer.module.css
- cognee-frontend/src/app/wizard/CognifyStep/CognifyStep.tsx
- cognee-frontend/src/modules/ingestion/DataView/index.ts
- cognee-frontend/src/app/wizard/AddStep/AddStep.tsx
- cognee-frontend/src/app/wizard/WizardPage.tsx
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.module.css
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.tsx
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx
- cognee/modules/users/authentication/get_auth_backend.py
- cognee/fetch_secret.py
🧰 Additional context used
🪛 Ruff (0.11.9)
cognee/api/v1/settings/routers/get_settings_router.py
50-50: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/users/exceptions/__init__.py
12-12: .exceptions.PermissionNotFoundError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/authentication/default/__init__.py
1-1: .default_transport.default_transport imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .default_jwt_strategy.DefaultJWTStrategy imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/models/__init__.py
4-4: .DatasetDatabase.DatasetDatabase imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/infrastructure/databases/utils/__init__.py
1-1: .get_or_create_dataset_database.get_or_create_dataset_database imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/data/exceptions/__init__.py
10-10: .exceptions.DatasetNotFoundError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
11-11: .exceptions.DatasetTypeError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/methods/__init__.py
1-1: .get_pipeline_run.get_pipeline_run imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/graph/methods/__init__.py
1-1: .get_formatted_graph_data.get_formatted_graph_data imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/authentication/api_bearer/__init__.py
1-1: .api_bearer_transport.api_bearer_transport imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .api_jwt_strategy.APIJWTStrategy imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/methods/__init__.py
5-5: .get_user_by_email.get_user_by_email imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/utils/__init__.py
1-1: .generate_pipeline_id.generate_pipeline_id imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .generate_pipeline_run_id.generate_pipeline_run_id imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/models/__init__.py
3-3: .PipelineRunInfo.PipelineRunInfo imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
4-4: .PipelineRunInfo.PipelineRunStarted imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
5-5: .PipelineRunInfo.PipelineRunYield imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
6-6: .PipelineRunInfo.PipelineRunCompleted imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
7-7: .PipelineRunInfo.PipelineRunErrored imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/data/methods/__init__.py
11-11: .get_authorized_existing_datasets.get_authorized_existing_datasets imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
12-12: .get_dataset_ids.get_dataset_ids imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
15-15: .delete_dataset.delete_dataset imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
16-16: .delete_data.delete_data imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
19-19: .load_or_create_datasets.load_or_create_datasets imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
22-22: .check_dataset_name.check_dataset_name imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee-mcp/src/test_client.py
123-124: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
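A hedged sketch of the single-`with` rewrite SIM117 asks for (a hypothetical file-copy helper, not the flagged test-client code): two nested `with` blocks collapse into one statement with multiple context managers.

```python
import os
import tempfile

def copy_text(src: str, dst: str) -> None:
    # flagged form:
    #     with open(src) as fin:
    #         with open(dst, "w") as fout:
    #             ...
    # preferred form: one statement, both context managers
    with open(src) as fin, open(dst, "w") as fout:
        fout.write(fin.read())
```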
207-207: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
244-244: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
281-281: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
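A minimal sketch of the B904 fix (names are illustrative, not from the flagged files): re-raise with `from err` so the traceback chains the original cause explicitly instead of relying on ambiguous implicit chaining.

```python
def parse_port(value: str) -> int:
    try:
        return int(value)
    except ValueError as err:
        # `from err` sets __cause__, distinguishing this error from an
        # unrelated failure inside the exception handler itself.
        raise RuntimeError(f"invalid port: {value!r}") from err
```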
cognee/modules/users/methods/get_user.py
4-4: sqlalchemy.exc imported but unused
Remove unused import: sqlalchemy.exc
(F401)
cognee-mcp/src/server.py
85-85: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py
3-3: cognee.context_global_variables.vector_db_config imported but unused
Remove unused import: cognee.context_global_variables.vector_db_config
(F401)
cognee/modules/users/authentication/default/default_jwt_strategy.py
1-1: jwt imported but unused
Remove unused import: jwt
(F401)
2-2: uuid.UUID imported but unused
Remove unused import: uuid.UUID
(F401)
3-3: fastapi_users.jwt.generate_jwt imported but unused
Remove unused import: fastapi_users.jwt.generate_jwt
(F401)
6-6: cognee.modules.users.models.User imported but unused
Remove unused import: cognee.modules.users.models.User
(F401)
7-7: cognee.modules.users.get_user_manager.UserManager imported but unused
Remove unused import: cognee.modules.users.get_user_manager.UserManager
(F401)
cognee-starter-kit/src/pipelines/low_level.py
38-38: Use a context manager for opening files
(SIM115)
41-41: Use a context manager for opening files
(SIM115)
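A hedged sketch of the SIM115 fix (hypothetical reader, not the starter-kit pipeline code): a `with` block guarantees the file handle is closed even if an exception interrupts the read.

```python
import tempfile

def read_text(path: str) -> str:
    # flagged form:  return open(path).read()  -- handle closed only by GC
    with open(path) as f:
        return f.read()
```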
cognee/context_global_variables.py
35-35: Use os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true" instead of not os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true"
Replace with != operator
(SIM201)
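An illustrative helper (not the actual cognee code) showing the SIM201 rewrite: a negated `==` comparison becomes a direct `!=`.

```python
def access_control_disabled(env: dict) -> bool:
    # flagged form:  not env.get(...).lower() == "true"
    return env.get("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true"
```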
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
5-5: typing.Dict imported but unused
Remove unused import
(F401)
5-5: typing.Any imported but unused
Remove unused import
(F401)
cognee/api/v1/add/routers/get_add_router.py
24-24: Do not perform function call Form in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/ingestion/classify.py
20-20: Multiple isinstance calls for data, merge into a single call
Merge isinstance calls for data
(SIM101)
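A hedged sketch of the SIM101 merge (illustrative predicate, not the classify.py code): a chain of `or`-joined `isinstance` calls becomes one call with a tuple of types.

```python
from io import BufferedReader
from tempfile import SpooledTemporaryFile

def is_file_like(data) -> bool:
    # flagged form:
    #     isinstance(data, BufferedReader) or isinstance(data, SpooledTemporaryFile)
    return isinstance(data, (BufferedReader, SpooledTemporaryFile))
```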
cognee/api/v1/search/routers/get_search_router.py
34-34: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/api/v1/search/search.py
24-24: Multiple isinstance calls for datasets, merge into a single call
Merge isinstance calls for datasets
(SIM101)
cognee/api/v1/permissions/routers/get_permissions_router.py
19-19: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
35-35: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
44-44: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-54: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
63-63: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/pipelines/operations/pipeline.py
76-76: Multiple isinstance calls for datasets, merge into a single call
Merge isinstance calls for datasets
(SIM101)
cognee/api/v1/cognify/cognify.py
100-100: Undefined name anext. Consider specifying requires-python = ">= 3.10" or tool.ruff.target-version = "py310" in your pyproject.toml file.
(F821)
105-105: Undefined name anext. Consider specifying requires-python = ">= 3.10" or tool.ruff.target-version = "py310" in your pyproject.toml file.
(F821)
cognee/api/v1/cognify/routers/get_cognify_router.py
43-43: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
76-77: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
cognee/api/v1/datasets/routers/get_datasets_router.py
75-75: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
89-89: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
154-154: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🪛 dotenv-linter (3.3.0)
cognee-starter-kit/.env.template
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
[warning] 3-3: [QuoteCharacter] The value has quote characters (', ")
[warning] 4-4: [QuoteCharacter] The value has quote characters (', ")
[warning] 6-6: [QuoteCharacter] The value has quote characters (', ")
[warning] 6-6: [UnorderedKey] The LLM_ENDPOINT key should go before the LLM_MODEL key
[warning] 7-7: [QuoteCharacter] The value has quote characters (', ")
[warning] 7-7: [UnorderedKey] The LLM_API_VERSION key should go before the LLM_ENDPOINT key
[warning] 10-10: [QuoteCharacter] The value has quote characters (', ")
[warning] 11-11: [QuoteCharacter] The value has quote characters (', ")
[warning] 12-12: [QuoteCharacter] The value has quote characters (', ")
[warning] 14-14: [QuoteCharacter] The value has quote characters (', ")
[warning] 14-14: [UnorderedKey] The EMBEDDING_ENDPOINT key should go before the EMBEDDING_MODEL key
[warning] 15-15: [QuoteCharacter] The value has quote characters (', ")
[warning] 15-15: [UnorderedKey] The EMBEDDING_API_VERSION key should go before the EMBEDDING_ENDPOINT key
[warning] 17-17: [ExtraBlankLine] Extra blank line detected
[warning] 18-18: [QuoteCharacter] The value has quote characters (', ")
[warning] 19-19: [EndingBlankLine] No blank line at the end of the file
[warning] 19-19: [QuoteCharacter] The value has quote characters (', ")
[warning] 19-19: [UnorderedKey] The GRAPHISTRY_PASSWORD key should go before the GRAPHISTRY_USERNAME key
🪛 Pylint (3.3.7)
cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py
[refactor] 4-4: Too few public methods (0/2)
(R0903)
cognee-starter-kit/src/pipelines/custom-model.py
[refactor] 33-33: Too few public methods (0/2)
(R0903)
[refactor] 36-36: Too few public methods (0/2)
(R0903)
[refactor] 41-41: Too few public methods (0/2)
(R0903)
[refactor] 44-44: Too few public methods (0/2)
(R0903)
cognee-mcp/src/test_client.py
[refactor] 201-204: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
[refactor] 238-241: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
[refactor] 275-278: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
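A hedged sketch of the R1723 cleanup (hypothetical loop, not the test-client code): once a branch ends in `break`, the following `elif` can be a plain `if`, because control never falls through from the `break`.

```python
def first_even(nums):
    result = None
    for n in nums:
        if n % 2 == 0:
            result = n
            break
        if n > 100:  # was `elif` in the flagged form; behavior is identical
            return None
    return result
```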
cognee/modules/data/methods/load_or_create_datasets.py
[refactor] 23-23: Consider merging these comparisons with 'in' by using 'identifier in (ds.name, ds.id)'. Use a set instead if elements are hashable.
(R1714)
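A hedged sketch of the R1714 merge (an illustrative `Dataset`, not the cognee model): two `==` comparisons joined by `or` become one membership test against a tuple, since `str` and `UUID` are both hashable.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4

@dataclass
class Dataset:
    id: UUID
    name: str

def matches(identifier, ds: Dataset) -> bool:
    # flagged form:  identifier == ds.name or identifier == ds.id
    return identifier in (ds.name, ds.id)
```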
cognee/modules/users/models/DatasetDatabase.py
[refactor] 7-7: Too few public methods (0/2)
(R0903)
cognee/modules/users/authentication/default/default_jwt_strategy.py
[refactor] 10-10: Too few public methods (0/2)
(R0903)
cognee-starter-kit/src/pipelines/low_level.py
[refactor] 14-14: Too few public methods (0/2)
(R0903)
[refactor] 19-19: Too few public methods (0/2)
(R0903)
[refactor] 25-25: Too few public methods (0/2)
(R0903)
[refactor] 29-29: Too few public methods (0/2)
(R0903)
[refactor] 38-38: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 41-41: Consider using 'with' for resource-allocating operations
(R1732)
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
[refactor] 84-114: Too many nested blocks (6/5)
(R1702)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py
[refactor] 544-546: Consider using '{"node_id": node_id}' instead of a call to 'dict'.
(R1735)
cognee/modules/ingestion/classify.py
[refactor] 20-20: Consider merging these isinstance calls to isinstance(data, (BufferedReader, SpooledTemporaryFile))
(R1701)
cognee/api/v1/search/routers/get_search_router.py
[refactor] 16-16: Too few public methods (0/2)
(R0903)
cognee/modules/pipelines/models/PipelineRunInfo.py
[refactor] 6-6: Too few public methods (0/2)
(R0903)
[refactor] 16-16: Too few public methods (0/2)
(R0903)
[refactor] 21-21: Too few public methods (0/2)
(R0903)
[refactor] 26-26: Too few public methods (0/2)
(R0903)
[refactor] 31-31: Too few public methods (0/2)
(R0903)
cognee/modules/search/methods/search.py
[refactor] 33-33: Too many arguments (8/5)
(R0913)
[refactor] 33-33: Too many positional arguments (8/5)
(R0917)
[refactor] 146-146: Too many arguments (6/5)
(R0913)
[refactor] 146-146: Too many positional arguments (6/5)
(R0917)
[refactor] 174-174: Too many arguments (6/5)
(R0913)
[refactor] 174-174: Too many positional arguments (6/5)
(R0917)
[refactor] 187-187: Too many arguments (6/5)
(R0913)
[refactor] 187-187: Too many positional arguments (6/5)
(R0917)
cognee/api/v1/search/search.py
[refactor] 12-12: Too many arguments (9/5)
(R0913)
[refactor] 12-12: Too many positional arguments (9/5)
(R0917)
[refactor] 24-24: Consider merging these isinstance calls to isinstance(datasets, (UUID, str))
(R1701)
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py
[refactor] 156-159: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
[error] 413-413: Possibly using variable 'collection_size' before assignment
(E0606)
cognee/api/v1/permissions/routers/get_permissions_router.py
[refactor] 25-25: Unnecessary use of a comprehension, use list(dataset_ids) instead.
(R1721)
cognee/modules/pipelines/operations/pipeline.py
[refactor] 76-76: Consider merging these isinstance calls to isinstance(datasets, (UUID, str))
(R1701)
cognee/api/v1/cognify/cognify.py
[refactor] 44-59: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
cognee/api/v1/cognify/routers/get_cognify_router.py
[refactor] 32-32: Too few public methods (0/2)
(R0903)
cognee/api/v1/datasets/routers/get_datasets_router.py
[refactor] 50-50: Too few public methods (0/2)
(R0903)
[refactor] 56-56: Too few public methods (0/2)
(R0903)
[refactor] 62-62: Too few public methods (0/2)
(R0903)
[refactor] 67-67: Too few public methods (0/2)
(R0903)
[refactor] 71-71: Too many statements (61/50)
(R0915)
cognee/modules/ontology/rdf_xml/OntologyResolver.py
[refactor] 17-17: Too few public methods (1/2)
(R0903)
[refactor] 126-126: Too many local variables (24/15)
(R0914)
[refactor] 126-126: Too many branches (15/12)
(R0912)
[refactor] 126-126: Too many statements (51/50)
(R0915)
🪛 Biome (1.9.4)
cognee-frontend/src/app/globals.css
[error] 30-30: This @import is in the wrong position.
Any @import rules must precede all other valid at-rules and style rules in a stylesheet (ignoring @charset and @layer), or else the @import rule is invalid.
Consider moving import position.
(lint/correctness/noInvalidPositionAtImportRule)
🪛 LanguageTool
cognee-starter-kit/README.md
[uncategorized] ~90-~90: You might be missing the article “an” here.
Context: ...D="" ``` Note: GRAPHISTRY_PASSWORD is API key. # What will you build with cogne...
(AI_EN_LECTOR_MISSING_DETERMINER_AN)
cognee-mcp/README.md
[style] ~165-~165: Consider a more concise word here.
Context: ...cp dev src/server.py` ### Development In order to use local cognee: 1. Uncomment the fol...
(IN_ORDER_TO_PREMIUM)
🪛 markdownlint-cli2 (0.17.2)
cognee-starter-kit/README.md
17-17: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
21-21: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
28-28: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
43-43: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
59-59: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
67-67: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
75-75: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
84-84: Bare URL used
null
(MD034, no-bare-urls)
86-86: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
cognee-mcp/README.md
52-52: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
56-56: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
60-60: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
64-64: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
68-68: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
72-72: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
76-76: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
80-80: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
109-109: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
125-125: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
169-169: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
175-175: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: End-to-End Tests / S3 Bucket Test
- GitHub Check: End-to-End Tests / Server Start Test
- GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
- GitHub Check: Basic Tests / Run Unit Tests
🔇 Additional comments (217)
cognee/modules/data/processing/document_types/open_data_file.py (1)
11-12: LGTM! Good lazy import optimization. Moving the `s3fs` import inside the conditional block is an excellent optimization. This ensures the module is only imported when S3 credentials are available and S3 functionality is actually needed, reducing startup time and avoiding import errors when `s3fs` isn't installed.
cognee-starter-kit/.env.template (1)
1-19: LGTM! Well-structured environment template. The environment template provides clear configuration options for LLM and embedding providers with sensible OpenAI defaults. The grouping of related variables (API_KEY, MODEL, PROVIDER followed by optional ENDPOINT/VERSION) is more intuitive than the alphabetical ordering suggested by the linter.
cognee-starter-kit/.gitignore (1)
1-196: LGTM! Comprehensive and well-organized gitignore. This gitignore file provides excellent coverage for a Python project with potential Node.js components. The thoughtful handling of the logs directory (preserving structure while ignoring contents) and the inclusion of project-specific directories like `.cognee_system/` and `.data_storage/` show good attention to the project's specific needs.
cognee-frontend/.prettierignore (1)
1-3: LGTM! Standard and appropriate Prettier exclusions. The exclusions for `node_modules`, `dist`, and `coverage` are exactly what's needed for a frontend project to avoid formatting generated files and dependencies.
cognee-frontend/src/app/auth/page.tsx (1)
1-1: LGTM! Stylistic improvement for consistency. The change from single to double quotes aligns with code formatting standards and likely supports the new Prettier configuration added to the project.
README.md (1)
42-48: LGTM! Clean formatting improvement. Removing trailing spaces after pipe characters improves the consistency and cleanliness of the markdown formatting in the language links section.
cognee-frontend/src/modules/chat/api/getHistory.ts (2)
1-1: LGTM! Consistent quote style. The change from single to double quotes in the import statement improves code style consistency.
5-5: LGTM! Consistent quote style. The change from single to double quotes for the API endpoint string maintains consistency with the import statement above.
cognee/infrastructure/databases/utils/__init__.py (1)
1-1: LGTM! Proper package-level function exposure. The import correctly exposes `get_or_create_dataset_database` at the package level, enabling easier access from other modules. The static analysis warning about unused import is a false positive since this is a deliberate API exposure pattern.
cognee/modules/users/exceptions/__init__.py (1)
12-12: LGTM! Consistent exception exposure pattern. Adding `PermissionNotFoundError` to the imported exceptions maintains consistency with other user-related exceptions exposed by this package. The static analysis warning is a false positive since this follows the established pattern for exception API exposure.
cognee/modules/graph/methods/__init__.py (1)
1-1: LGTM! Valid re-export in `__init__.py` file. The static analysis warning about unused import is a false positive. This import serves as a re-export in the package's `__init__.py` file, making `get_formatted_graph_data` available at the package level for use in API endpoints (as mentioned in the AI summary).
cognee-frontend/src/utils/index.ts (1)
1-3: LGTM! Good standardization and utility addition. The changes standardize on double quotes and add the new `useBoolean` utility hook to the exports. This aligns with the frontend modernization mentioned in the AI summary.
cognee/api/v1/delete/routers/get_delete_router.py (1)
17-17: LGTM! Consistent route path standardization. Removing the trailing slash from the route decorator improves consistency across API routers, as mentioned in the AI summary. This helps avoid potential routing ambiguities between paths with and without a trailing slash.
cognee/exceptions/exceptions.py (1)
34-36: LGTM! Excellent addition of string representation for exceptions. Adding the `__str__` method to the base exception class follows Python best practices and provides consistent, human-readable error messages. The format matches the logging output in the `__init__` method, ensuring consistency across the codebase.
cognee-frontend/Dockerfile (4)
2-2: LGTM! Good Node.js version upgrade. Upgrading from Node.js 18 to 22 brings performance improvements, security updates, and access to newer JavaScript features. This aligns with the frontend modernization efforts.
12-12: LGTM! Reasonable to disable problematic build step. Commenting out the `npm rebuild lightningcss` command suggests it may have been causing build issues or is no longer necessary with the updated dependencies and Node.js version.
18-18: LGTM! PostCSS configuration for Tailwind CSS support. Adding the PostCSS configuration file copy aligns with the Tailwind CSS integration mentioned in the AI summary, supporting the new styling approach in the frontend.
22-22: LGTM! Improved CMD instruction format. Using the exec form `["npm", "run", "dev"]` instead of shell form is a Docker best practice. It ensures proper signal handling and process management within the container.
cognee-frontend/src/app/auth/signup/page.tsx (1)
1-1: LGTM! Standard Next.js routing pattern. This follows the established Next.js convention for page routing, where `page.tsx` serves as the entry point and delegates to the actual page component.
cognee/api/v1/cognify/code_graph_pipeline.py (3)
5-5: Good addition for consistent logging setup. Adding `setup_logging` to the imports aligns with the logging standardization across the codebase.
100-100: Proper logging setup in main execution block. Good practice to set up logging before running the main async function.
46-46: Verify the significant batch size reduction. The batch size was reduced from 500 to 30 (94% reduction), which is a substantial change that could significantly impact pipeline performance. Please ensure this change is intentional and addresses a specific issue.
```shell
#!/bin/bash
# Description: Check for any comments or documentation about batch size changes
# Expected: Find context about why batch size was reduced

# Search for any TODOs, comments, or documentation about batch size
rg -i "batch.?size|memory|timeout|performance" --type py -A 3 -B 3

# Check if there are any related issues or configurations
rg "add_data_points.*batch" --type py -A 2 -B 2
```
cognee-frontend/.prettierrc (1)
1-7: Well-configured Prettier setup with Tailwind CSS support. The configuration includes sensible defaults and the `prettier-plugin-tailwindcss` plugin for automatic Tailwind class sorting, which will help maintain consistent code formatting across the frontend.
cognee/api/v1/users/routers/get_visualize_router.py (1)
11-11: Route path standardization looks good. Removing the explicit trailing slash from the route decorator aligns with the broader API routing consistency improvements mentioned in the PR.
cognee-frontend/src/app/auth/login/page.tsx (1)
1-1: LGTM! Consistent with Next.js routing conventions. This mirrors the same pattern used in the signup page and follows standard Next.js routing practices.
cognee/modules/search/operations/get_history.py (1)
19-22: LGTM! Good query optimization. The changes improve query construction by:
- Moving `order_by` before `limit` ensures proper ordering of the union results
- Adding the conditional check prevents unnecessary `LIMIT 0` clauses when limit is not positive
- Improves overall query performance and correctness
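The ordering/limit interaction can be sketched with plain SQL (an illustrative in-memory table, not the actual cognee query code): ORDER BY must be applied before LIMIT so the limit keeps the top rows of the ordered result, and a non-positive limit skips the clause entirely instead of emitting `LIMIT 0`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, created_at INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)", [(1, 30), (2, 10), (3, 20)]
)

def latest(limit: int):
    # order first, then (conditionally) limit
    sql = "SELECT id FROM history ORDER BY created_at DESC"
    if limit > 0:
        sql += f" LIMIT {int(limit)}"
    return [row[0] for row in conn.execute(sql)]
```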
cognee/modules/users/authentication/api_bearer/__init__.py (1)
1-2: LGTM! Standard package re-export pattern. This follows the standard Python pattern for making submodule components accessible at the package level. The static analysis warnings about unused imports are false positives - these imports are intentionally re-exported for convenient package-level access.
cognee-frontend/src/modules/ingestion/addData.ts (1)
1-19: LGTM! Good code standardization. The changes standardize string literals to use double quotes, improving code consistency across the frontend codebase. No functional changes were made - the ingestion logic remains correct.
cognee/modules/pipelines/utils/__init__.py (1)
1-2: LGTM! Clean utility re-exports. This follows standard Python package patterns for re-exporting utility functions at the package level. The static analysis warnings about unused imports are false positives - these functions are intended for package-level access and are used elsewhere in the pipeline system.
cognee/api/v1/settings/routers/get_settings_router.py (2)
49-49: LGTM: Route path consistency improvement. Removing trailing slashes from route decorators improves consistency across the API endpoints.
Also applies to: 55-55
50-50: Static analysis false positive - FastAPI dependency injection pattern is correct. The static analysis tool flagged the `Depends(get_authenticated_user)` usage in the function parameter default, but this is the standard and correct FastAPI pattern for dependency injection. The `Depends()` call needs to be in the parameter default to work properly with FastAPI's dependency injection system.
cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1)
3-4: Good improvement: Using relative units for better responsiveness. Changing from fixed `16px` to `1rem` units makes the loading indicator more responsive and accessible, scaling with user font size preferences.
cognee/modules/users/methods/__init__.py (1)
5-5:

```shell
#!/bin/bash
# Check if __all__ is defined in the re-export __init__.py
rg "__all__" -n cognee/modules/users/methods/__init__.py
```

cognee-frontend/src/ui/Icons/GitHubIcon.tsx (1)
6-6: Excellent improvement: Dynamic color support enhances reusability. Changing from hardcoded `"#fdfdfd"` to the dynamic `{color}` prop makes the GitHubIcon component much more flexible and reusable across different contexts and themes. This aligns with good React component design principles.
cognee-frontend/src/app/layout.tsx (1)
8-8: Good branding update. Updating the title from the default "Create Next App" to "Cognee" properly brands the application.
cognee-frontend/postcss.config.mjs (1)
1-5: Verify the Tailwind CSS PostCSS plugin name. The plugin name `@tailwindcss/postcss` appears unusual. Typically, the Tailwind CSS PostCSS plugin is referenced as `tailwindcss`. Please verify that this plugin name is correct. If it should be the standard Tailwind CSS plugin, apply this diff:

```diff
- "@tailwindcss/postcss": {},
+ "tailwindcss": {},
```

What is the correct PostCSS plugin name for Tailwind CSS in 2024?
cognee/modules/pipelines/operations/log_pipeline_run_start.py (2)
7-7: LGTM! Good addition of the utility import. The import for `generate_pipeline_run_id` aligns with the deterministic ID generation approach mentioned in the PR summary.
18-18: LGTM! Improved pipeline run ID generation. The change from `uuid4()` to `generate_pipeline_run_id(pipeline_id, dataset_id)` provides deterministic and consistent pipeline run identification, which improves traceability and aligns with the broader pipeline refactoring efforts.
cognee/modules/retrieval/utils/brute_force_triplet_search.py (1)
128-129: LGTM! Good documentation improvement. The added parameter descriptions for `node_type` and `node_name` improve the function's documentation clarity and align with the enhanced filtering capabilities mentioned in the PR summary.
cognee/eval_framework/analysis/metrics_calculator.py (1)
41-53: LGTM! Excellent defensive programming improvement. The None check prevents failed evaluations from corrupting the metrics data and details. This enhances the robustness of the evaluation framework and aligns well with the retry logic improvements mentioned in the PR summary.
cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py (1)
234-236: Good defensive programming practice. This safeguard correctly handles the edge case where a collection is empty, preventing LanceDB from failing when `limit` is 0. The early return with an empty list is both efficient and prevents runtime errors.
cognee/api/v1/visualize/visualize.py (2)
5-5: Logging import updated for consistency. Good addition of `setup_logging` to the import statement to support the improved logging configuration.
31-31: Improved logging initialization. The switch from `get_logger(level=ERROR)` to `setup_logging(log_level=ERROR)` aligns with the broader logging standardization effort across the codebase and provides more comprehensive logging configuration.
cognee-frontend/src/app/globals.css (1)
20-20: Good addition for full-height layout. Adding `height: 100%` to the html and body selectors supports full-height layouts, which is useful for modern web applications.
1-8: LGTM! Clean implementation of API bearer transport.The bearer transport configuration follows FastAPI Users patterns correctly. The token URL endpoint and transport naming are appropriate for the API authentication flow.
cognee/infrastructure/databases/vector/get_vector_engine.py (2)
1-1: LGTM! Proper migration to context-aware configuration.The import change to
get_vectordb_context_configaligns with the new context-aware database configuration system for multi-tenant support.
6-7: LGTM! Clean implementation with helpful documentation.The function simplification and explanatory comment clearly communicate the context-based configuration selection. The direct parameter unpacking is efficient and appropriate.
cognee/modules/data/models/Dataset.py (1)
22-23: LGTM! Proper ACL relationship implementation.The bidirectional relationship with cascade delete ensures proper cleanup of ACL records when datasets are removed. This correctly implements the dataset-scoped permission system.
cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py (1)
4-5: Consider future extensibility for API-specific JWT logic.The current implementation is a simple identity subclass of
JWTStrategy. While this serves as a type identifier for the API authentication backend, consider whether API-specific JWT handling (like different token lifetimes, claims, or validation rules) might be needed in the future.Note: The Pylint warning about too few public methods is a false positive in this context, as this class serves an architectural purpose in the authentication system.
cognee/infrastructure/databases/graph/config.py (1)
110-118: LGTM! Well-implemented context-aware configuration. The function correctly implements the pattern for multi-tenant database configurations. The fallback to `get_graph_config().to_hashable_dict()` ensures backward compatibility, and the documentation clearly explains the purpose and benefits.
cognee/modules/pipelines/models/__init__.py (1)
2-8: LGTM! Static analysis warnings are false positives. The imports correctly expose the new pipeline run info classes as part of the module's public API. The static analysis warnings about unused imports are false positives - these classes are intended to be imported by consumers of this module.
cognee/api/v1/users/routers/get_auth_router.py (2)
2-2: LGTM! Authentication backend update aligns with the refactor. The change from `get_auth_backend` to `get_client_auth_backend` is consistent with the broader authentication refactor that introduces separate backends for API and client authentication.
6-6: Function call correctly updated to match the import. The function call is properly updated to use the new client authentication backend.
cognee/modules/users/methods/get_user.py (2)
3-3: Good improvement: selectinload is more efficient for multiple relationships. The change from `joinedload` to `selectinload` is a good optimization when loading multiple relationships (`roles` and `tenant`), as it avoids potential cartesian products.
22-24: Excellent error handling improvement. Adding explicit error handling with `EntityNotFoundError` is much better than returning `None`. This provides clearer error semantics and better debugging information.
cognee-frontend/src/utils/useBoolean.ts (1)
1-14: LGTM! Well-implemented React hook following best practices. The `useBoolean` hook is cleanly implemented with proper use of `useState` and `useCallback`. The memoized setter functions with empty dependency arrays are correctly implemented to prevent unnecessary re-renders.
cognee-frontend/src/ui/Partials/SettingsModal/Settings.tsx (1)
1-201: Clarify if this commenting out is temporary or permanent. The entire Settings component has been commented out, removing all configuration UI functionality. While this aligns with the backend changes introducing `ENABLE_BACKEND_ACCESS_CONTROL`, consider:
- If this is a permanent removal, delete the commented code to improve maintainability
- If this is temporary, add a comment explaining the reason and timeline for restoration
Is this commenting out temporary or should the code be completely removed?
cognee-frontend/eslint.config.mjs (1)
1-16: Well-structured ESLint configuration following modern best practices. The configuration properly handles ES modules, integrates with Next.js and TypeScript, and includes Prettier for consistent formatting. The setup is clean and follows current ESLint standards.
cognee/modules/retrieval/exceptions/exceptions.py (1)
29-36: Well-implemented exception class following consistent patterns. The new `CollectionDistancesNotFoundError` exception properly inherits from `CogneeApiError`, uses an appropriate 404 status code, and provides a clear error message. The implementation is consistent with other exception classes in the module.
cognee/modules/users/exceptions/exceptions.py (1)
51-58: Exception implementation follows consistent patterns. The `PermissionNotFoundError` exception is well-implemented and consistent with other exception classes. The 403 status code is appropriate for permission-related errors, though you might consider whether 404 would be more semantically correct for "not found" scenarios versus "access denied" scenarios.
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1)
456-457: Good defensive programming practice. Ensuring the directory exists before truncating the SQLite database file prevents potential file operation failures. This change improves the robustness of the database deletion process.
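The defensive pattern described here is the standard `os.makedirs(..., exist_ok=True)` guard before touching a file. A minimal sketch (the path layout is illustrative, not Cognee's actual directory structure):

```python
import os
import tempfile

# Illustrative path; the parent "databases" directory may not exist yet.
db_path = os.path.join(tempfile.mkdtemp(), "databases", "cognee.sqlite")

# Without this guard, opening db_path would raise FileNotFoundError
# whenever the parent directory was deleted or never created.
os.makedirs(os.path.dirname(db_path), exist_ok=True)

# Truncate (or create) the database file safely.
with open(db_path, "w"):
    pass

exists = os.path.exists(db_path)
```

`exist_ok=True` makes the call idempotent, so it is safe to run on every deletion/truncation path.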
cognee/modules/pipelines/utils/generate_pipeline_id.py (1)
4-5: LGTM! Clean deterministic UUID generation. The function correctly generates a deterministic UUID5 based on user ID and pipeline name, which enables consistent pipeline identification across the system.
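Deterministic UUID5 generation of this kind can be sketched as follows. The namespace and string format are assumptions for illustration; Cognee's actual `generate_pipeline_id` may combine its inputs differently.

```python
from uuid import NAMESPACE_OID, UUID, uuid5


def generate_pipeline_id(user_id: UUID, pipeline_name: str) -> UUID:
    # uuid5 is a pure function of (namespace, name), so the same inputs
    # always yield the same pipeline ID across processes and restarts.
    return uuid5(NAMESPACE_OID, f"{user_id}{pipeline_name}")


user = UUID("12345678-1234-5678-1234-567812345678")
a = generate_pipeline_id(user, "cognify_pipeline")
b = generate_pipeline_id(user, "cognify_pipeline")
c = generate_pipeline_id(user, "other_pipeline")
```

Unlike `uuid4()`, repeated calls with the same user and pipeline name produce identical IDs, which is what makes cross-run pipeline identification possible.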
cognee/modules/users/get_fastapi_users.py (1)
14-19: Good enhancement for flexible authentication. The dual-backend authentication setup enables more flexible authentication strategies. The implementation correctly instantiates both backends and passes them as a list to FastAPIUsers.
cognee-frontend/src/modules/auth/auth0.ts (1)
1-8: LGTM! Clean Auth0 configuration. The `Auth0Client` setup follows standard practices with appropriate OAuth scopes and API audience configuration.
cognee/modules/retrieval/graph_completion_retriever.py (3)
11-13: Good addition of logging capabilities. Adding structured logging will help with debugging and monitoring retrieval operations.
56-56: Improved node content fallback logic. The change to use "description" as the primary fallback before "name" provides more meaningful content when "text" is unavailable.
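The fallback order described here ("text", then "description", then "name") can be expressed as a simple chained lookup. The field names match the review's description, but the helper itself is a hypothetical sketch, not the retriever's actual code:

```python
def get_node_content(node: dict) -> str:
    # Prefer full text, then a description, then the bare name; empty string last.
    return node.get("text") or node.get("description") or node.get("name") or ""


full = get_node_content({"text": "chunk body", "name": "Node A"})
desc = get_node_content({"description": "a person entity", "name": "Node B"})
name_only = get_node_content({"name": "Node C"})
```

The `or` chain treats missing keys and empty strings the same way, which is usually the desired behavior for display fallbacks.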
139-139: Helpful warning log for empty results. This warning will aid in debugging scenarios where queries return no relevant triplets.
cognee-frontend/src/ui/Icons/index.ts (1)
1-4: Clean addition of new icon exports. The new icon exports follow the established pattern and maintain consistency with existing exports.
cognee-frontend/src/ui/elements/Input.tsx (1)
1-8: Well-implemented reusable Input component. The component properly uses TypeScript typing, classNames for CSS merging, and follows React best practices with prop spreading. The Tailwind styling provides consistent appearance.
cognee/infrastructure/databases/vector/config.py (1)
67-73: Good implementation of context-aware configuration. The function properly supports context-specific vector database configurations with appropriate fallback to default settings. The design enables multi-tenant or dataset-scoped database access.
Verify that the context variable pattern is used consistently across the codebase:
```shell
#!/bin/bash
# Description: Check usage of vector_db_config context variable across the codebase
# Expected: Consistent usage pattern and proper context management
echo "=== Searching for vector_db_config usage ==="
rg -A 3 -B 3 "vector_db_config"
echo -e "\n=== Searching for get_vectordb_context_config usage ==="
rg -A 3 -B 3 "get_vectordb_context_config"
```

cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1)
1-31: Good implementation with room for minor improvements. The component follows React and Next.js conventions well. The structure is clean, and the use of the shared `AuthForm` component promotes reusability.
41-41: Verify the polling interval increase is intentional. The polling interval was increased from 5 seconds to 50 seconds (a 10x increase). This significant change could impact user experience by making status updates appear slower.
Please confirm this 10x increase in polling interval is intentional and aligns with backend performance requirements or rate limiting constraints.
18-30: Good error handling and async patterns. The `fetchDatasetStatuses` function properly handles async operations and maintains clean state management.
72-82:

```shell
#!/bin/bash
# Inspect how FastAPI Users is configured to determine the auth backend and credential method
sed -n '1,200p' cognee/modules/users/methods/get_fastapi_users.py
```

cognee/infrastructure/databases/graph/kuzu/adapter.py (4)
49-54: Improved robustness for database initialization. The enhanced error handling for `FileExistsError` during database directory creation is a good improvement that handles edge cases where a file exists at the database path.
1052-1052: Good generalization of edge query. Changing from `MATCH (n:Node)-[r:EDGE]->(m:Node)` to `MATCH (n:Node)-[r]->(m:Node)` makes the query more flexible by matching any relationship type, not just those labeled `EDGE`.
46-83: Robust database initialization with proper error handling. The initialization method properly handles connection setup and schema creation with good error handling and logging.
1164-1167: Fix logical issue with edge appending. The edge appending logic has an indentation issue that could cause edges to not be added when JSON parsing fails. The current code only appends edges when JSON parsing succeeds, but edges should be appended regardless:

```diff
 if props:
     try:
         data = json.loads(props)
     except json.JSONDecodeError:
         logger.warning(f"Failed to parse JSON props for edge {from_id}->{to_id}")
-        edges.append((from_id, to_id, rel_type, data))
+edges.append((from_id, to_id, rel_type, data))
```

The edge should always be appended, whether JSON parsing succeeds or fails.
Likely an incorrect or invalid review comment.
cognee/modules/users/methods/get_user_by_email.py (1)
9-21: Well-structured async database query function. The function properly uses async patterns with SQLAlchemy, includes appropriate eager loading for related entities, and follows good practices for database access.
cognee-frontend/src/ui/elements/index.ts (1)
1-8: Well-organized barrel export file. The index file follows good practices for organizing UI component exports with consistent naming and structure.
cognee-frontend/src/ui/Partials/index.ts (2)
2-2: Confirm the deprecation strategy for commented exports. The exports for the `SettingsModal` and `Explorer` components are commented out rather than completely removed. This suggests they may be temporarily disabled or deprecated. Please confirm whether these components should be permanently removed or if they're temporarily disabled for future re-enablement.
Also applies to: 5-5
6-6: LGTM: New FeedbackForm export added. The addition of the FeedbackForm export enhances the UI component library with user feedback functionality.
cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1)
4-18: LGTM: Well-structured color mapping system. The node color mapping is well-organized and uses consistent Tailwind CSS colors. The variety of colors provides good visual distinction for different node types.
cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py (1)
11-11: LGTM: Clean refactor to dataset-level permissions. The changes correctly update the permission checking from document-level to dataset-level, which aligns with the broader permission system refactor mentioned in the PR summary.
Also applies to: 34-34
cognee/modules/graph/utils/expand_with_nodes_and_edges.py (1)
78-78: Verify the category mapping logic. The replacement of `isinstance` checks with string comparisons may have incorrect mappings. Based on typical ontology patterns, `Thing` (individuals/instances) should map to `"individuals"` and `ThingClass` (classes/types) should map to `"classes"`.
Please verify that the category mapping is correct by checking the `AttachedOntologyNode` implementation:

```shell
#!/bin/bash
# Description: Verify the category mapping in AttachedOntologyNode
# Expected: Find how Thing and ThingClass map to category values
ast-grep --pattern 'class AttachedOntologyNode { $$$ }'
rg -A 10 -B 5 "category.*classes|category.*individuals"
```

Also applies to: 90-90, 159-159, 171-171
cognee/modules/users/models/User.py (2)
13-13: LGTM: Clean integration with FastAPI Users. The multiple inheritance from `SQLAlchemyBaseUserTableUUID` and `Principal` properly integrates the User model with FastAPI Users while maintaining existing functionality.
50-50: Verify the default verification status. Setting `is_verified: bool = True` by default means new users are automatically verified. Confirm this aligns with your authentication requirements. Consider whether auto-verification is the intended behavior for your use case, or if it should default to `False`, requiring explicit verification.
cognee/api/client.py (2)
71-71: Excellent security improvement! Restricting CORS origins from the wildcard `"*"` to specific localhost URLs significantly improves security by preventing unauthorized cross-origin requests from malicious websites.
199-199: Good logging setup enhancement. Adding a `setup_logging()` call in the main entry point ensures consistent logging configuration across the application, which aligns with the broader logging improvements mentioned in the AI summary.
cognee/modules/users/methods/get_default_user.py (1)
37-39: Excellent exception handling refinement! The specific handling of `NoResultFound` exceptions with detailed error messages is a significant improvement over generic exception catching. This provides better error context and maintains proper exception chaining.
cognee-starter-kit/src/pipelines/default.py (1)
8-72: Excellent pipeline demonstration script! The code is well-structured with proper async patterns, clear directory configuration, and comprehensive demonstration of Cognee features. The use of pathlib for cross-platform path handling and the logical flow from data ingestion to various search types makes this an excellent example for users.
cognee/modules/pipelines/operations/log_pipeline_run_initiated.py (2)
4-4: LGTM: Deterministic pipeline run ID generation. The change from `uuid4()` to `generate_pipeline_run_id(pipeline_id, dataset_id)` provides consistent, deterministic pipeline run identification, which aligns with the broader system improvements for dataset-scoped operations.
9-9: LGTM: Consistent ID generation usage. The implementation correctly uses the new deterministic ID generation utility function.
cognee-frontend/src/ui/Partials/SearchView/SearchView.module.css (1)
1-7: LGTM: Clean message spacing implementation. The sibling combinator CSS rules provide appropriate vertical spacing between adjacent user and system messages. The implementation is clean and minimal.
cognee/modules/data/methods/get_unique_dataset_id.py (1)
6-9: LGTM: Flexible dataset ID handling with proper type safety. The function enhancement to accept both string and UUID inputs is well-implemented:
- Proper type annotation with `Union[str, UUID]`
- Early return optimization for UUID inputs
- Maintains backward compatibility for string dataset names
- Uses `isinstance()` for reliable type checking

This improvement aligns with the system-wide dataset UUID handling enhancements.
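The str/UUID handling described above can be sketched as follows. Note this is a simplified illustration: the real `get_unique_dataset_id` in Cognee also takes the owning user into account, which is omitted here.

```python
from typing import Union
from uuid import NAMESPACE_OID, UUID, uuid5


def get_unique_dataset_id(dataset: Union[str, UUID]) -> UUID:
    # Early return: an explicit UUID is already a stable identifier.
    if isinstance(dataset, UUID):
        return dataset
    # String names are hashed deterministically, preserving the old behavior
    # where the same dataset name always resolves to the same ID.
    return uuid5(NAMESPACE_OID, dataset)


existing = UUID("12345678-1234-5678-1234-567812345678")
from_uuid = get_unique_dataset_id(existing)
from_name = get_unique_dataset_id("my_dataset")
```

The `isinstance` check keeps the two input kinds on separate, predictable paths without any string parsing.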
cognee/eval_framework/corpus_builder/corpus_builder_executor.py (1)
64-67: LGTM: Proper implementation of streaming pipeline execution. The change from a direct `await cognee_pipeline(tasks=tasks)` to async iteration correctly implements the new streaming pipeline run model:
- Assigns the pipeline to a variable without immediately awaiting it
- Uses `async for` to iterate over yielded `run_info` events
- Enables processing of intermediate pipeline results

This aligns with the broader architectural changes for streaming pipeline run results and background processing.
cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1)
28-29: Good use of proper resource cleanup. The `finally` block ensures the adapter connection is properly closed even if queries fail.
cognee/infrastructure/databases/vector/create_vector_engine.py (3)
46-46: Good fix for critical typo. The correction from `utl=vector_db_url` to `url=vector_db_url` fixes a runtime error that would occur when using supported databases.
11-12: Appropriate default values for optional parameters. Using empty strings as defaults for `vector_db_port` and `vector_db_key` makes sense since the function handles missing credentials with proper error messages later.
10-12: Verify impact of parameter reordering. The parameter order change (moving `vector_db_url` to second position and adding defaults) is a breaking change for existing callers using positional arguments.
Run this script to check for existing usages that might be affected:

```shell
#!/bin/bash
# Search for calls to create_vector_engine to assess breaking change impact
rg -A 3 "create_vector_engine\(" --type py
```

cognee-frontend/src/middleware.ts (2)
5-17: Middleware is currently a pass-through placeholder. The middleware allows all requests without any processing. The commented Auth0 code suggests this is intentional preparation for future authentication.
Consider adding a TODO comment to clarify the intended timeline for enabling the Auth0 integration.
19-29: Well-configured matcher for Next.js static files. The matcher properly excludes Next.js static assets and common metadata files while applying to all other routes.
cognee-mcp/src/server.py (4)
7-7: LGTM: Improved logging import structure. The addition of `setup_logging` to the imports enhances the logging configuration capabilities.
77-77: LGTM: Corrected parameter name. The change from `nodeset="developer_rules"` to `node_set=["developer_rules"]` appears to correct both the parameter name and its structure.
94-94: LGTM: Improved encapsulation by removing global variable. Moving `log_file` retrieval from global scope to local function scope improves encapsulation and thread safety.
Also applies to: 176-176, 238-238
459-459: LGTM: Consistent logging initialization. The change to use `setup_logging()` provides consistent logging configuration across the application.
cognee/__init__.py (1)
9-16: LGTM: Proper initialization sequence for environment and logging. The early loading of environment variables ensures that `LOG_LEVEL` and other configuration values are available when logging is initialized. The `override=True` parameter ensures environment variables take precedence over any defaults.
4-6: LGTM: Proper TypeScript interface definition. The `FooterProps` interface provides good type safety for the optional children prop.
10-23: LGTM: Clean Tailwind CSS implementation. The refactor to use Tailwind utility classes provides a cleaner, more maintainable styling approach compared to CSS modules.
17-21: Verify icon color change aligns with design system. The icon colors were changed from white to black. Please ensure this change aligns with the overall design system and provides sufficient contrast in all usage contexts.
cognee-frontend/src/app/auth/layout.tsx (2)
6-9: LGTM: Proper Next.js metadata configuration. The metadata export follows Next.js 13+ app directory conventions correctly.
11-31: LGTM: Clean and well-structured layout component. The component provides good visual hierarchy with proper spacing and responsive layout using Tailwind classes, and follows React best practices for layout components.
cognee/modules/data/exceptions/exceptions.py (1)
25-42: LGTM! Well-structured exception classes. The new exception classes follow the established pattern and use appropriate HTTP status codes. The implementation is consistent with existing exceptions in the codebase.
Consider making the error messages more specific by allowing parameterization:
```diff
 class DatasetNotFoundError(CogneeApiError):
     def __init__(
         self,
-        message: str = "Dataset not found.",
+        message: str = None,
+        dataset_id: str = None,
         name: str = "DatasetNotFoundError",
         status_code=status.HTTP_404_NOT_FOUND,
     ):
+        if message is None:
+            message = f"Dataset {dataset_id} not found." if dataset_id else "Dataset not found."
         super().__init__(message, name, status_code)
```

cognee/modules/data/methods/__init__.py (1)
11-22: LGTM! New dataset method imports are properly organized. The new imports follow the established pattern and are well-categorized. The static analysis warnings about unused imports are false positives; these imports are meant to be re-exported for use by other modules.
The categorization (Get, Create, Check) makes the code more maintainable and follows the existing structure.
cognee-frontend/src/app/auth/login/LoginPage.tsx (2)
8-31: LGTM! Well-structured login page component. The component follows Next.js best practices with proper use of client-side rendering, Image optimization, and Link navigation. The integration with the AuthForm component provides good separation of concerns.
33-40: LGTM! Proper payload formatting for form-encoded API. The formatPayload function correctly converts the email/password data to URLSearchParams format, which aligns with typical form-encoded authentication endpoints that expect "username" and "password" fields.
cognee-starter-kit/src/pipelines/custom-model.py (2)
33-48: LGTM! Well-defined data models for graph representation. The DataPoint classes are properly structured for representing programming languages and their fields. The metadata configuration for index fields is appropriate for search optimization.
Note: The static analysis warnings about "too few public methods" are false positives - these are data classes/models, not traditional classes that need multiple methods.
65-89: LGTM! Comprehensive demonstration of search capabilities. The code effectively demonstrates multiple search types (graph completion, RAG completion, summaries, and chunks) with proper async/await patterns and clear output formatting.
cognee/modules/users/models/ACL.py (1)
18-22: LGTM! Improved permission architecture with dataset-scoped ACLs. The shift from data-level to dataset-level permissions is a good architectural decision that simplifies permission management. The CASCADE delete ensures proper cleanup when datasets are removed, and the relationship changes are consistent with the foreign key updates.
cognee-frontend/src/app/page.tsx (1)
1-4: Clean architectural refactor with appropriate dynamic rendering. The complete refactor from a complex dataset management UI to a simple re-export pattern is well-executed. The `dynamic = "force-dynamic"` export is appropriate for a graph visualization that likely requires real-time data updates.
10-13: Good addition of required imports for retry functionality. The time and logging imports are appropriately added to support the new retry mechanism.
75-75: Clean integration of retry mechanism. The modification to use the new `_calculate_metric` method properly centralizes the retry logic and error handling.
cognee/modules/users/models/DatasetDatabase.py (1)
7-19: Well-designed SQLAlchemy model with proper relationships and constraints. The model design follows best practices:
- Appropriate use of UUID types for security and uniqueness
- Proper foreign key relationships with CASCADE delete behavior
- Unique constraints on database names prevent conflicts
- Timezone-aware timestamps with proper defaults
The static analysis hint about "too few public methods" is a false positive for SQLAlchemy ORM models, which inherit necessary methods from the Base class.
Verify that the foreign key references exist in the database schema:
```shell
#!/bin/bash
# Description: Verify that the referenced tables and columns exist
# Expected: Find principals.id and datasets.id column definitions
echo "Checking for principals table with id column:"
rg -A 10 -B 5 "class.*Principal" --type py
echo -e "\nChecking for datasets table with id column:"
rg -A 10 -B 5 "class.*Dataset" --type py
```

cognee-frontend/src/app/auth/AuthPage.tsx (1)
12-40: Well-structured conditional rendering with clean UI. The conditional rendering logic is clean, and the UI structure using Tailwind CSS is appropriate. The session-based flow clearly handles both authenticated and unauthenticated states.
Verify the session object structure matches the expected format:
```shell
#!/bin/bash
# Description: Check the Auth0 session structure and user properties
# Expected: Find type definitions or documentation for session.user.name
echo "Checking Auth0 session type definitions:"
rg -A 15 -B 5 "interface.*Session|type.*Session" --type ts --type tsx
echo -e "\nChecking auth0 module exports:"
fd "auth0" --type f --exec cat {}
```

cognee/infrastructure/databases/graph/neo4j_driver/neo4j_metrics_utils.py (2)
59-63: Excellent optimization using GDS stats procedure. The change from streaming all components to using `gds.wcc.stats` is a significant performance improvement. The stats procedure directly returns the component count without requiring data transfer and manual aggregation.
183-187: Clean optimization for clustering coefficient calculation. Similar to the connected components optimization, using `gds.localClusteringCoefficient.stats` is more efficient than streaming individual coefficients and computing the average manually. The direct stats approach reduces both network overhead and computation time.
cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1)
14-30: Function signature and documentation look good. Clear parameter typing and a comprehensive docstring explaining the atomic create-or-fetch behavior.
cognee/modules/users/authentication/default/default_jwt_strategy.py (1)
10-23: Clarify the implementation status of this class. This class appears to be a placeholder with all functionality commented out. Please clarify whether:
- This is work-in-progress and should be completed
- This is intentionally stubbed for future implementation
- The commented code should be removed entirely
If this is production code, consider adding a docstring or TODO comment to indicate the intended status.
cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1)
8-11:

```shell
#!/bin/bash
# Search for where the context var is set to confirm the return type of get_vectordb_context_config
rg -C3 "vector_db_config.set" .
```

cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (2)
617-617: Good standardization of method naming. Renaming `get_neighbours` to `get_neighbors` improves consistency with the American English spelling conventions used elsewhere in the codebase.
637-654: Well-implemented node retrieval methods. The new `get_node` and `get_nodes` methods follow established patterns in the class and provide useful functionality for direct node access by ID(s). The implementations are clean and consistent:
- Proper type hints with `Optional[Dict[str, Any]]` and `List[Dict[str, Any]]`
- Consistent Cypher query patterns
- Appropriate return value handling (None for a missing single node, an empty list for no matches)

cognee/modules/ingestion/classify.py (2)
2-2: Good practice: Proper type annotation for optional dependency. The change from `S3FileSystem` to `Any` for the `s3fs` parameter is appropriate when dealing with optional dependencies.
8-14: Approve the conditional import pattern. The try/except import block at module level is a good practice for handling optional dependencies gracefully.

cognee/modules/metrics/operations/get_pipeline_run_metrics.py (2)
6-6: Good refactoring: Updated import aligns with new architecture. The import of `PipelineRunInfo` correctly reflects the new pipeline run streaming architecture.
33-40: Good implementation: Proper database query and caching logic. The logic correctly checks for existing metrics before creating new ones, which provides good caching behavior and avoids unnecessary computation.

cognee-frontend/src/ui/elements/Select.tsx (3)
1-4: Good imports: Proper TypeScript and utility imports. The imports are well-structured, using appropriate TypeScript types and utility functions.
5-5: Excellent TypeScript usage: Proper interface extension. The component correctly extends `SelectHTMLAttributes<HTMLSelectElement>` and uses proper destructuring with rest props.
7-18: Good implementation: Proper select element with styling. The select element implementation is solid with proper class merging and prop spreading.

cognee/modules/graph/methods/get_formatted_graph_data.py (1)
6-7: Good implementation: Proper context setup for database access control. The function correctly sets up database context variables for dataset and user scoping.

cognee/modules/data/methods/load_or_create_datasets.py (3)
9-16: Good function design: Clear documentation and type hints. The function signature and docstring clearly explain the behavior and expectations.
30-32: Good error handling: Appropriate exception for missing UUIDs. The logic correctly raises `DatasetNotFoundError` when a UUID identifier doesn't match any existing dataset, which is the expected behavior.
35-40: Verify unique ID generation doesn't cause race conditions. Ensure that `get_unique_dataset_id` handles concurrent calls properly to avoid potential race conditions when multiple users create datasets with the same name simultaneously.

```shell
#!/bin/bash
# Search for the implementation of get_unique_dataset_id to verify its concurrency handling
ast-grep --pattern 'async def get_unique_dataset_id($$$)'
```
26-26:

```shell
#!/bin/bash
# Locate the useChat hook file
echo "File path:"
fd --max-depth 4 useChat.ts

# Display the hook signature and dataset parameter usage
echo -e "\nuseChat signature and parameters:"
rg --color=never -n "function useChat" --context 3 cognee-frontend/src/modules/chat/hooks/useChat.ts

echo -e "\nOccurrences of 'dataset' in hook implementation:"
rg --color=never -n "\bdataset\b" --context 2 cognee-frontend/src/modules/chat/hooks/useChat.ts

echo -e "\nOccurrences of 'datasets' in request payload:"
rg --color=never -n "datasets" --context 2 cognee-frontend/src/modules/chat/hooks/useChat.ts
```

.env.template (2)
52-55: Excellent documentation for Docker Compose setup. The addition of clear guidance for using Postgres with Docker Compose is very helpful and addresses a common deployment scenario.
74-81: Well-documented access control feature with appropriate defaults. The comprehensive documentation of the `ENABLE_BACKEND_ACCESS_CONTROL` feature, including supported databases and implications, is excellent. The default `False` value is appropriately conservative for a security feature.

cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1)
33-37: Handle queue exceptions gracefully. The `get_from_queue` function could raise an `asyncio.QueueEmpty` exception when calling `get_nowait()` on an empty queue, though it's currently handled by the conditional check. The implementation correctly handles the empty queue case with the conditional check before calling `get_nowait()`.

cognee-frontend/src/ui/elements/TextArea.tsx (1)
72-78: Potential infinite loop in useEffect. The `useEffect` dependency on `value` combined with checking `textAreaText !== value` could create update cycles if the comparison fails due to whitespace or formatting differences. Add a ref to track the last programmatically set value:

```diff
+const lastSetValueRef = useRef<string>('');
+
 useEffect(() => {
   const fakeTextAreaElement = fakeTextAreaRef.current;
   const textAreaText = fakeTextAreaElement?.innerText;
-  if (fakeTextAreaElement && textAreaText !== value && textAreaText !== placeholder) {
+  if (fakeTextAreaElement && lastSetValueRef.current !== value && textAreaText !== placeholder) {
     fakeTextAreaElement.innerText = value;
+    lastSetValueRef.current = value;
   }
 }, [value]);
```

Likely an incorrect or invalid review comment.
cognee/modules/pipelines/models/PipelineRunInfo.py (1)
6-13: Well-designed base model with appropriate configuration. The `PipelineRunInfo` base class effectively defines the core structure for pipeline run information. The `arbitrary_types_allowed` configuration is appropriate for the flexible `payload` field.
alembic/versions/ab7e313804ae_permission_system_rework.py (4)
27-48: Well-structured table definition helper function. The `_define_dataset_table` function correctly defines the dataset table schema without depending on application models. This is the recommended approach for Alembic migrations.
82-119: Robust permission management with proper error handling. The `_ensure_permission` function handles permission creation efficiently and includes proper fallback logic. The use of `bulk_insert` is appropriate for single-row inserts in this context.
147-184: Solid migration logic with proper data handling. The upgrade function follows good practices:
- Drops and recreates the table cleanly
- Uses helper functions to avoid model dependencies
- Handles empty dataset case gracefully
- Uses bulk operations for performance
- Creates comprehensive permission sets (read, write, share, delete)
187-222: Complete and correct downgrade implementation. The downgrade function properly reverses the migration by recreating the data-based ACLs table and migrating permissions back. The logic mirrors the upgrade process appropriately.
cognee-frontend/src/app/(graph)/GraphView.tsx (4)
18-27: Well-defined TypeScript interfaces. The `GraphNode` and `GraphData` interfaces provide clear type definitions for the component's data structures.
36-48: Robust data change handler with proper validation. The `onDataChange` callback correctly handles both reset (null) and update scenarios, with appropriate validation for empty data.
75-80: Smart re-render optimization using key prop.Using
data?.nodes.lengthas the key forGraphVisualizationensures the component re-renders when the data structure changes significantly. This is a good optimization technique.
101-102: Clean imperative API usage with proper null checks.The callback functions correctly use the non-null assertion operator (
!) which is safe here since the callbacks are only called from child components when the refs are guaranteed to be initialized.cognee/modules/pipelines/operations/run_tasks.py (5)
2-13: LGTM: Clean import organization for refactored functionality.
The new imports are well-organized and support the transition to structured pipeline run events. The addition of user methods, pipeline utilities, and structured run info models aligns well with the refactoring objectives.

86-89: Excellent defensive programming and pipeline ID generation improvement.
The addition of default user fetching provides good fallback behavior, and the switch from manual UUID5 generation to the `generate_pipeline_id` utility function improves maintainability and consistency.

95-98: Great improvement: Structured pipeline run events.
The transition from yielding raw pipeline run objects to structured `PipelineRunStarted` events provides a much cleaner API contract and better type safety for consumers of this function.

108-111: Consistent event structure for pipeline yields.
The use of the `PipelineRunYield` wrapper maintains consistency with the structured event approach, making it easier for consumers to handle different pipeline states.

113-125: Improved error handling with structured events.
The error and completion handling now uses structured events (`PipelineRunCompleted`, `PipelineRunErrored`), which provides better consistency and makes it easier for consumers to distinguish between different pipeline states. The error re-raising preserves the original exception while still providing structured feedback.

cognee/context_global_variables.py (2)
12-13: Excellent use of ContextVar for async context isolation.
The use of `ContextVar` for database configurations is the correct approach for maintaining different database contexts across async tasks and threads. This enables proper isolation for dataset-scoped access control.

51-67: Well-structured database configuration setup.
The database configuration dictionaries are well-structured and the context variable setting provides proper isolation. The separation of vector and graph database configurations supports the multi-database architecture effectively.
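The isolation pattern this refers to can be sketched with the standard library alone; the variable and task names below are illustrative, not Cognee's actual `context_global_variables` implementation:

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable holding the active graph-database config.
graph_db_config: ContextVar[dict] = ContextVar("graph_db_config", default={})

async def run_for_dataset(dataset_name: str) -> str:
    # Each task runs in its own copy of the context, so this set() is
    # invisible to the other tasks running concurrently.
    graph_db_config.set({"graph_database_name": f"graph_{dataset_name}"})
    await asyncio.sleep(0)  # yield control so the tasks interleave
    return graph_db_config.get()["graph_database_name"]

async def main() -> list:
    return await asyncio.gather(*(run_for_dataset(n) for n in ("a", "b", "c")))

print(asyncio.run(main()))  # → ['graph_a', 'graph_b', 'graph_c']
```

Because `asyncio.gather` wraps each coroutine in a task with a copied context, every dataset sees only its own configuration, which is what makes dataset-scoped database access safe under concurrency.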
cognee-frontend/src/app/(graph)/GraphVisualization.tsx (2)
161-167: Excellent force simulation configuration.
The D3 force configuration with collision detection and charge forces is well-implemented. The parameters provide a good balance between node separation and graph stability.

176-225: Good fallback mechanism for graph rendering.
The conditional rendering with a fallback example graph provides a good user experience when no data is available. The example data clearly demonstrates the expected functionality.
cognee/api/v1/search/routers/get_search_router.py (4)
14-19: Excellent documentation and API design for dataset filtering.
The comment clearly explains the distinction between dataset names (owner-scoped) and dataset UUIDs (permission-based access). The optional fields provide flexible search scoping while maintaining security boundaries.

33-33: Consistent route path normalization.
Removing the trailing slash from route decorators improves consistency across the API. This change aligns with similar updates mentioned in other routers.

52-59: Excellent permission-aware search implementation.
The addition of dataset parameters to the search function and the graceful handling of `PermissionDeniedError` by returning an empty list instead of an error provides a good user experience while maintaining security.

36-36: Verify the impact of limit=0 parameter.
The addition of `limit=0` to the `get_history` call may change the behavior significantly. Ensure this change is intentional and that the search history endpoint should return unlimited results.

```shell
#!/bin/bash
# Description: Check the get_history function implementation to understand the limit parameter behavior
ast-grep --pattern 'def get_history($$$)'
```

cognee/modules/users/methods/get_authenticated_user.py (1)
6-6: Excellent simplification using FastAPI Users.
The migration from manual JWT handling to FastAPI Users' built-in `current_user` dependency significantly improves maintainability, reduces potential security issues, and follows established patterns. The `active=True` parameter ensures only active users can authenticate.

cognee-starter-kit/src/pipelines/low_level.py (2)
14-33: Consider adding validation methods to DataPoint classes.
The static analysis correctly identifies that these classes have too few public methods, but this is acceptable for data models. However, consider adding validation methods if business logic requires it.
The data model structure is clean and appropriate for representing the company hierarchy with proper metadata for indexing.
77-125: LGTM! Well-structured async pipeline with proper error isolation.
The main function properly sets up the Cognee environment, handles configuration, and orchestrates the pipeline execution. The use of async generators for pipeline status monitoring is appropriate.
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (3)
30-30: Good addition of BASE_LABEL constant for consistent node labeling.
This constant ensures all nodes have a common base label, which is essential for implementing proper uniqueness constraints and consistent querying patterns.

51-52: Appropriate configuration to disable Neo4j notifications.
Disabling notifications with `notifications_min_severity="OFF"` is a good practice for production environments to reduce log noise.

54-60: Well-implemented initialization method with proper constraint creation.
The initialize method correctly creates a uniqueness constraint on the `id` property for the base label, which is essential for data integrity.

cognee-mcp/README.md (1)

1-36: Excellent visual improvements and comprehensive structure.
The new header with logo, badges, and centered layout significantly improves the visual appeal and professionalism of the documentation.
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (3)
54-81: Excellent error handling and logging in API request method.
The `_make_request` method properly handles HTTP errors, logs detailed information, and provides good error context for debugging.

14-21: Well-implemented UUIDEncoder for JSON serialization.
The custom JSON encoder properly handles UUID objects by converting them to strings, which is essential for API communication.
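The encoder pattern described here can be sketched in a few lines (an illustrative stand-in, not the adapter's exact code):

```python
import json
import uuid

class UUIDEncoder(json.JSONEncoder):
    """JSON encoder that serializes UUID objects as plain strings."""

    def default(self, obj):
        if isinstance(obj, uuid.UUID):
            return str(obj)  # UUIDs are not JSON-serializable by default
        return super().default(obj)

node_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
payload = json.dumps({"node_id": node_id}, cls=UUIDEncoder)
print(payload)  # → {"node_id": "12345678-1234-5678-1234-567812345678"}
```

Falling back to `super().default(obj)` preserves the standard error for genuinely unserializable types.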
183-197: Good schema initialization pattern with proper error handling.
The schema initialization logic correctly checks for existing schema and creates it only when needed, with appropriate error handling and logging.
cognee-frontend/types/d3-force-3d.d.ts (2)
10-20: Well-defined 3D simulation node interface.
The `SimulationNodeDatum3D` interface properly extends the base 2D interface with the necessary 3D position, velocity, and fixed position properties. The optional nature of `fx`, `fy`, and `fz` is appropriate.

22-41: Comprehensive 3D force function declarations.
The exported functions cover all the essential force types for 3D simulations, including center, many-body, link, collision, radial, and directional forces. The type signatures are consistent and properly typed.
cognee/api/v1/add/routers/get_add_router.py (1)
53-55: LGTM! Good improvement to return operation results.
The addition of capturing and returning the `add_run` result improves the API by providing feedback about the operation. The use of `model_dump()` properly serializes the result for JSON response.

cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (1)
90-93: LGTM! Good form submission handling.
The enter key handling with shift key detection for multiline input is well implemented. The use of `requestSubmit()` ensures proper form validation.

cognee/infrastructure/databases/graph/get_graph_engine.py (4)
5-5: LGTM! Context-aware configuration improves multi-tenancy.
The change from `get_graph_config` to `get_graph_context_config` aligns with the PR's objective of implementing dataset-scoped database management. This enables per-request or per-dataset database configurations.

Also applies to: 13-13

21-23: Good addition of async initialization pattern.
The conditional async initialization allows adapters to perform setup operations without requiring all adapters to implement the pattern. This is a clean extension point.

35-39: Parameter reordering may break existing code.
The reordering of parameters in `create_graph_engine` (moving `graph_file_path` before connection parameters) could break existing callers that rely on positional arguments.

```shell
#!/bin/bash
# Description: Check for any direct calls to create_graph_engine that might be affected
# Expected: Find calls using positional arguments that need updating
rg -A 5 "create_graph_engine\(" --type py
```

114-125: I want to inspect the full adapter to confirm available methods and error handling. Let's fetch its contents and look for any async methods:

```shell
#!/bin/bash
# Print the adapter file
sed -n '1,200p' cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
# Check specifically for async methods
rg "async def" -n cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
```

cognee-frontend/src/app/(graph)/GraphControls.tsx (1)
147-153: LGTM! Proper cleanup of timeout in useEffect.
The cleanup function properly clears the timeout to prevent memory leaks. The dependency array correctly includes all relevant variables.
cognee/modules/pipelines/operations/pipeline.py (5)
48-53: Verify context variable handling and documentation.
The context variable setup is well-designed for async task isolation. The comment explains the purpose clearly.

90-92: Good error handling for dataset validation.
The explicit check and custom exception provide clear feedback when no datasets are available for processing.

115-117: Ensure database context isolation per dataset.
The context setup before task execution is crucial for backend access control. Good implementation.

166-169: Verify context parameter threading through run_tasks.
Ensure the context parameter is properly handled throughout the task execution chain.
```shell
#!/bin/bash
# Description: Verify context parameter is properly threaded through run_tasks
# Check run_tasks function signature and usage
ast-grep --pattern 'def run_tasks($$$, context=$_, $$$)'
# Check if context is used within run_tasks implementation
rg -A 20 "def run_tasks" | head -30
```
93-102:

```shell
#!/bin/bash
# Search for all usages of run_pipeline to identify how it's consumed
rg -Hn "run_pipeline" -A 5 -B 5
# Look for any asyncio.gather patterns that might expect full results
rg -Hn "asyncio.gather" -A 5 -B 5
# Check for list() or any eager conversion around run_pipeline
rg -Hn "list(.*run_pipeline" -A 5 -B 5
rg -Hn "await run_pipeline" -A 5 -B 5
```

cognee/modules/search/methods/search.py (2)

`196-202`: **Good concurrent search implementation.** The asyncio.gather approach efficiently processes multiple datasets concurrently while maintaining context isolation.

`187-193`:

```shell
#!/bin/bash
# Dump the full implementation of get_or_create_dataset_database for inspection
sed -n '1,200p' cognee/infrastructure/databases/utils/get_or_create_dataset_database.py
```

cognee/api/v1/permissions/routers/get_permissions_router.py (4)

`19-19`: **FastAPI Depends usage is correct here.** The Ruff warning about B008 is a false positive in this context. FastAPI's dependency injection system requires Depends() in function signatures.

`35-40`: **Consistent authentication enforcement.** Good that all endpoints now require authentication. The owner_id pattern is consistently applied.

`52-60`: **New tenant membership endpoint looks good.** The endpoint follows the established pattern with proper authentication and ownership tracking.

`23-28`:

```shell
#!/bin/bash
# Correctly retrieve implementation of get_specific_user_permission_datasets
rg -A 50 "def get_specific_user_permission_datasets" cognee/modules/users/permissions/methods
```

cognee/api/v1/cognify/cognify.py (3)
102-113: Good background processing implementation.
The async task creation and queue management for background processing is well implemented. The StopAsyncIteration handling ensures graceful completion.

128-128: Verify permission system migration.
The change from `check_permissions_on_documents` to `check_permissions_on_dataset` aligns with the system-wide permission refactor.
100-105:

```shell
#!/bin/bash
set -e

echo "■ Checking setup.py for python_requires"
rg "python_requires" -n setup.py || echo "  (not found)"

echo
echo "■ Checking pyproject.toml for requires-python"
rg "requires-python" -n pyproject.toml || echo "  (not found)"

echo
echo "■ Checking setup.cfg for requires-python"
rg "requires-python" -n setup.cfg || echo "  (not found)"

echo
echo "■ Inspecting GitHub Actions workflows for Python versions"
rg "python-version" -n .github/workflows || echo "  (no workflows or no python-version key)"
```

cognee/api/v1/cognify/routers/get_cognify_router.py (5)

`45-48`: **Good input validation for dataset parameters.** The validation ensures at least one dataset identifier is provided, preventing empty processing requests.

`96-121`: **Robust WebSocket loop with proper cleanup.** The infinite loop with queue polling, error handling, and cleanup on disconnect/completion is well implemented. The 2-second polling interval provides a good balance.

`119-121`: **Good WebSocket disconnect handling.** Proper queue cleanup on WebSocket disconnect prevents memory leaks.

`107-113`: **Verify graph data formatting performance.** The `get_formatted_graph_data` call in the WebSocket loop could be expensive. Consider caching or optimization for real-time updates.

```shell
#!/bin/bash
# Description: Check graph data formatting implementation for performance
# Find get_formatted_graph_data implementation
ast-grep --pattern 'def get_formatted_graph_data($$$)'
# Look for any caching or optimization patterns
rg -A 20 "get_formatted_graph_data"
```
67-88:

```shell
#!/bin/bash
# Display the implementation of get_authenticated_user to verify token validation logic
sed -n '1,200p' cognee/modules/users/methods/get_authenticated_user.py
```

cognee/modules/ontology/rdf_xml/OntologyResolver.py (6)

`6-6`: **Import change looks good for rdflib migration.** The import correctly brings in the necessary rdflib components to replace owlready2 functionality.

`17-34`: **Well-designed wrapper class for ontology nodes.** The `AttachedOntologyNode` class provides a clean abstraction that handles URI name extraction properly, supporting both "#" and "/" separators. The implementation is straightforward and the `__repr__` method aids debugging.

`37-54`: **Good handling of optional ontology file loading.** The constructor properly handles cases where the ontology file doesn't exist by setting `self.graph = None` and logging an informative message. Error handling with proper exception chaining is also well implemented.

`55-62`: **URI normalization logic is correct.** The `_uri_to_key` method properly extracts names from URIs and normalizes them consistently. The handling of both "#" and "/" separators covers the common URI patterns.

`63-98`: **Refactored lookup building logic looks correct.** The method properly handles the case where `self.graph` is None and correctly identifies classes and individuals using RDF triple patterns. The logic for finding individuals by checking if their type is a known class is sound.

`119-125`: **Category detection method is well-implemented.** The `_get_category` helper method provides a clean way to determine whether a URI represents a class or individual by checking against the lookup dictionaries.

cognee/api/v1/datasets/routers/get_datasets_router.py (4)

`50-69`: **Well-structured DTO classes for graph data.** The new Pydantic models (`GraphNodeDTO`, `GraphEdgeDTO`, `GraphDTO`, `DatasetCreationPayload`) provide clear structure for graph data and dataset creation requests. The use of `OutDTO` and `InDTO` base classes maintains consistency with the API design.
`87-114`: **Dataset creation endpoint with proper permission handling.** The new endpoint correctly checks for existing datasets before creation and assigns full permissions to the creator. The implementation properly handles the database session and includes appropriate error handling.

`74-86`: **Permission-based dataset retrieval is secure.** The modified `get_datasets` endpoint now properly filters datasets based on user permissions, which is a significant security improvement over the previous implementation.

`125-125`: **Improved exception handling with specific error types.** The use of `DatasetNotFoundError` and `DataNotFoundError` instead of generic exceptions provides better error categorization and makes debugging easier. The error messages are also informative.

Also applies to: 144-144, 149-149, 226-226, 232-234, 239-241
```python
owner_id = Column(UUID, index=True)
```
🛠️ Refactor suggestion
Address schema design and migration concerns for the new owner_id column.
The addition of owner_id introduces important functionality, but several design aspects need clarification:
- Missing foreign key constraint - Should this reference `User.id`?
- Nullable specification - Is owner required or optional for tenants?
- Migration strategy - How will existing tenant records handle this new column?
- Missing relationship - Consider adding a relationship to the owner User.
Consider this enhanced implementation:
- owner_id = Column(UUID, index=True)
+ owner_id = Column(UUID, ForeignKey("users.id"), nullable=True, index=True)
+
+ # Relationship to owner user
+ owner = relationship("User", foreign_keys=[owner_id])

Also ensure proper database migration scripts are in place for existing tenants.
🤖 Prompt for AI Agents
In cognee/modules/users/models/Tenant.py around lines 14 to 15, the owner_id
column lacks a foreign key constraint, nullable specification, and a
relationship to the User model. Fix this by adding a ForeignKey constraint
referencing User.id, specify whether owner_id is nullable based on whether the
owner is required or optional, and add a SQLAlchemy relationship to the User
model for owner. Additionally, create and apply a database migration script to
add this column to existing tenant records, handling default values or nulls as
appropriate.
```python
if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
```
🛠️ Refactor suggestion
Misuse of setup_logging overriding logger
Assigning logger = setup_logging(...) likely masks the module‐level logger and assumes setup_logging returns a logger. Instead, call setup_logging(log_level=ERROR) to configure global logging, then use get_logger(level=ERROR) to retrieve the logger.
🤖 Prompt for AI Agents
In cognee/modules/retrieval/utils/description_to_codepart_search.py at line 147,
the code incorrectly assigns the result of setup_logging(log_level=ERROR) to the
variable logger, which masks the module-level logger and assumes setup_logging
returns a logger. To fix this, call setup_logging(log_level=ERROR) without
assignment to configure global logging, then separately call
get_logger(level=ERROR) to retrieve and assign the logger variable.
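The configure-then-retrieve split this asks for looks like the following with the standard library (an illustrative stand-in for Cognee's `setup_logging`/`get_logger`, whose exact signatures are not shown here):

```python
import logging

def setup_logging(log_level: int) -> None:
    # Configure global logging once; deliberately returns nothing.
    logging.basicConfig(level=log_level)

def get_logger(level: int = logging.NOTSET) -> logging.Logger:
    logger = logging.getLogger("description_to_codepart_search")
    logger.setLevel(level)
    return logger

setup_logging(log_level=logging.ERROR)    # configure globally, no assignment
logger = get_logger(level=logging.ERROR)  # then fetch the logger separately

assert logger.level == logging.ERROR
assert not logger.isEnabledFor(logging.INFO)
```

Keeping configuration and retrieval separate avoids relying on a configuration function's return value, which is exactly the assumption the review flags.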
```python
def check_dataset_name(dataset_name: str):
    if "." in dataset_name or " " in dataset_name:
        raise ValueError("Dataset name cannot contain spaces or underscores")
```
Fix inconsistency between validation logic and error message.
The function checks for spaces and periods in the dataset name, but the error message mentions "spaces or underscores". This inconsistency could confuse users.
Apply this diff to align the error message with the validation logic:
- raise ValueError("Dataset name cannot contain spaces or underscores")
+ raise ValueError("Dataset name cannot contain spaces or periods")

Alternatively, if underscores should also be forbidden, update the validation logic:
- if "." in dataset_name or " " in dataset_name:
+ if "." in dataset_name or " " in dataset_name or "_" in dataset_name:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def check_dataset_name(dataset_name: str): | |
| if "." in dataset_name or " " in dataset_name: | |
| raise ValueError("Dataset name cannot contain spaces or underscores") | |
| def check_dataset_name(dataset_name: str): | |
| if "." in dataset_name or " " in dataset_name: | |
| raise ValueError("Dataset name cannot contain spaces or periods") |
🤖 Prompt for AI Agents
In cognee/modules/data/methods/check_dataset_name.py lines 1 to 3, the error
message incorrectly states that dataset names cannot contain spaces or
underscores, while the validation logic actually checks for spaces and periods.
To fix this, either update the error message to say "Dataset name cannot contain
spaces or periods" to match the current checks, or modify the validation logic
to also check for underscores if those should be forbidden as well.
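The first option can be verified with a runnable sketch that keeps the existing checks and aligns the message with them:

```python
def check_dataset_name(dataset_name: str) -> None:
    # Message now matches what is actually checked: spaces and periods.
    if "." in dataset_name or " " in dataset_name:
        raise ValueError("Dataset name cannot contain spaces or periods")

check_dataset_name("valid_name")  # underscores remain allowed in this variant

for bad in ("my dataset", "my.dataset"):
    try:
        check_dataset_name(bad)
    except ValueError as err:
        print(f"{bad!r} rejected: {err}")
```

Whichever option is chosen, keeping the message and the condition in lockstep prevents the confusion the review points out.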
```css
  text-decoration: none;
}

@import "tailwindcss";
```
Fix @import positioning to comply with CSS standards.
The @import rule must be placed at the beginning of the CSS file (after any @charset or @layer rules) to be valid. Currently, it's positioned after other CSS rules, which could cause browsers to ignore it.
Move the import to the top of the file:
+@import "tailwindcss";
+
:root {
--max-width: 1100px;
--border-radius: 2px;
--font-mono: ui-monospace, Menlo, Monaco, "Cascadia Mono", "Segoe UI Mono",
"Roboto Mono", "Oxygen Mono", "Ubuntu Monospace", "Source Code Pro",
"Fira Mono", "Droid Sans Mono", "Courier New", monospace;
--button-padding: 14px 20px !important;
--button-border-radius: 100px !important;
--global-color-primary: #6510F4 !important;
--global-color-primary-active: #500cc5 !important;
--global-color-primary-text: white !important;
--global-color-secondary: #0DFF00 !important;
--global-background-default: #0D051C;
--textarea-default-color: #0D051C !important;
}
html,
body {
height: 100%;
max-width: 100vw;
overflow-x: hidden;
}
a {
color: inherit;
text-decoration: none;
}
-
-@import "tailwindcss";

🧰 Tools
🪛 Biome (1.9.4)
[error] 30-30: This @import is in the wrong position.
Any @import rules must precede all other valid at-rules and style rules in a stylesheet (ignoring @charset and @layer), or else the @import rule is invalid.
Consider moving import position.
(lint/correctness/noInvalidPositionAtImportRule)
🤖 Prompt for AI Agents
In cognee-frontend/src/app/globals.css at line 30, the @import "tailwindcss";
rule is placed after other CSS rules, which violates CSS standards. Move this
@import statement to the very top of the file, before any other CSS rules,
except for any @charset or @layer rules if present, to ensure it is properly
recognized by browsers.
```python
async def get_pipeline_run(pipeline_run_id: UUID):
    db_engine = get_relational_engine()

    async with db_engine.get_async_session() as session:
        query = select(PipelineRun).filter(PipelineRun.pipeline_run_id == pipeline_run_id)

        return await session.scalar(query)
```
🛠️ Refactor suggestion
Add return type annotation and consider error handling.
The function is missing a return type annotation and lacks explicit error handling for potential database connection issues.
+from typing import Optional
+
-async def get_pipeline_run(pipeline_run_id: UUID):
+async def get_pipeline_run(pipeline_run_id: UUID) -> Optional[PipelineRun]:
+ """Retrieve a pipeline run by its UUID.
+
+ Args:
+ pipeline_run_id: The UUID of the pipeline run to retrieve
+
+ Returns:
+ The PipelineRun instance if found, None otherwise
+ """
db_engine = get_relational_engine()
async with db_engine.get_async_session() as session:
query = select(PipelineRun).filter(PipelineRun.pipeline_run_id == pipeline_run_id)
     return await session.scalar(query)
🤖 Prompt for AI Agents
In cognee/modules/pipelines/methods/get_pipeline_run.py around lines 9 to 15,
add a return type annotation to the get_pipeline_run function to specify the
expected return type, such as Optional[PipelineRun]. Additionally, wrap the
database query logic in a try-except block to catch and handle potential
database connection errors, logging or raising an appropriate exception to
ensure robust error handling.
```python
if existing_metrics:
    metrics_for_pipeline_runs.append(existing_metrics)
else:
    graph_metrics = await graph_engine.get_graph_metrics(include_optional)
    metrics = GraphMetrics(
        id=pipeline_run.pipeline_run_id,
        num_tokens=await fetch_token_count(db_engine),
        num_nodes=graph_metrics["num_nodes"],
        num_edges=graph_metrics["num_edges"],
        mean_degree=graph_metrics["mean_degree"],
        edge_density=graph_metrics["edge_density"],
        num_connected_components=graph_metrics["num_connected_components"],
        sizes_of_connected_components=graph_metrics["sizes_of_connected_components"],
        num_selfloops=graph_metrics["num_selfloops"],
        diameter=graph_metrics["diameter"],
        avg_shortest_path_length=graph_metrics["avg_shortest_path_length"],
        avg_clustering=graph_metrics["avg_clustering"],
    )
    metrics_for_pipeline_runs.append(metrics)
    session.add(metrics)
```
🛠️ Refactor suggestion
Add error handling for graph metrics retrieval.
Consider adding error handling for the graph engine operations, as network or database issues could cause failures.
else:
- graph_metrics = await graph_engine.get_graph_metrics(include_optional)
+ try:
+ graph_metrics = await graph_engine.get_graph_metrics(include_optional)
+ except Exception as e:
+ # Log the error and either re-raise or return None/empty metrics
+ # depending on your error handling strategy
+ raise RuntimeError(f"Failed to retrieve graph metrics: {e}") from e
+
metrics = GraphMetrics(📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| existing_metrics = existing_metrics.scalars().first() | |
| if existing_metrics: | |
| metrics_for_pipeline_runs.append(existing_metrics) | |
| else: | |
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) | |
| else: | |
| try: | |
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| except Exception as e: | |
| # Log the error and either re-raise or return None/empty metrics | |
| # depending on your error handling strategy | |
| raise RuntimeError(f"Failed to retrieve graph metrics: {e}") from e | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) |
🤖 Prompt for AI Agents
In cognee/modules/metrics/operations/get_pipeline_run_metrics.py around lines 41
to 58, the code calls graph_engine.get_graph_metrics without error handling,
which could cause unhandled exceptions if network or database issues occur. Wrap
the call to get_graph_metrics and subsequent processing in a try-except block to
catch potential exceptions, log the error details, and handle the failure
gracefully, such as by skipping the current pipeline run or returning a default
value.
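The prompt above mentions skipping the current pipeline run when metric collection fails. A minimal, framework-free sketch of that skip-on-error pattern (all names here are hypothetical stand-ins, not Cognee's API):

```python
import logging

logger = logging.getLogger(__name__)

def collect_metrics(pipeline_runs, compute_metrics):
    """Skip runs whose metric computation fails instead of aborting the whole batch."""
    collected = []
    for run in pipeline_runs:
        try:
            collected.append(compute_metrics(run))
        except Exception as error:  # narrow to expected engine errors in real code
            logger.error("Skipping run %s: %s", run, error)
    return collected

def compute(run):
    # Stand-in for graph_engine.get_graph_metrics(...)
    if run == "bad-run":
        raise ConnectionError("graph engine unreachable")
    return {"run": run, "num_nodes": 42}

metrics = collect_metrics(["run-1", "bad-run", "run-2"], compute)
```

The trade-off versus re-raising is that partial results are returned silently, so the log line is the only trace of the failure.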
graph_client = await get_graph_engine()
(nodes, edges) = await graph_client.get_graph_data()
🛠️ Refactor suggestion
Add error handling for database operations.
Consider adding error handling for the graph engine operations, as they could fail due to network or database issues.
- graph_client = await get_graph_engine()
- (nodes, edges) = await graph_client.get_graph_data()
+ try:
+ graph_client = await get_graph_engine()
+ (nodes, edges) = await graph_client.get_graph_data()
+ except Exception as e:
+ raise RuntimeError(f"Failed to retrieve graph data: {e}") from e
🤖 Prompt for AI Agents
In cognee/modules/graph/methods/get_formatted_graph_data.py around lines 9 to
10, the calls to get_graph_engine() and graph_client.get_graph_data() lack error
handling, which could cause unhandled exceptions if network or database issues
occur. Wrap these calls in a try-except block to catch potential exceptions, log
or handle the errors appropriately, and ensure the function can fail gracefully
or retry as needed.
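The prompt also suggests the caller could "retry as needed"; one generic way to sketch that, assuming the failures are transient network errors, is an exponential-backoff wrapper (a hypothetical helper, not part of Cognee):

```python
import asyncio

async def with_retries(make_call, attempts=3, base_delay=0.01):
    """Retry an async call with exponential backoff, wrapping the last error."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception as error:  # narrow to ConnectionError/TimeoutError in real code
            last_error = error
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Call failed after {attempts} attempts") from last_error

# Example: a flaky call that succeeds on the third try.
calls = {"count": 0}

async def flaky_get_graph_data():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network issue")
    return ([1, 2], [(1, 2)])

nodes, edges = asyncio.run(with_retries(flaky_get_graph_data))
```

Backoff keeps repeated failures from hammering the graph backend while still surfacing a hard error once the budget is exhausted.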
if datasets:
    # Function handles transforming dataset input to dataset IDs (if possible)
    dataset_ids = await get_dataset_ids(datasets, user)
    # If dataset_ids are provided filter these datasets based on what user has permission for.
    if dataset_ids:
        existing_datasets = await get_specific_user_permission_datasets(
            user.id, permission_type, dataset_ids
        )
    else:
        existing_datasets = []
else:
    # If no datasets are provided, work with all existing datasets user has permission for.
    existing_datasets = await get_all_user_permission_datasets(user, permission_type)

return existing_datasets
🛠️ Refactor suggestion
Add error handling for helper function calls.
The function doesn't handle potential exceptions from get_dataset_ids, get_specific_user_permission_datasets, or get_all_user_permission_datasets, which could cause unhandled errors to propagate.
if datasets:
# Function handles transforming dataset input to dataset IDs (if possible)
- dataset_ids = await get_dataset_ids(datasets, user)
+ try:
+ dataset_ids = await get_dataset_ids(datasets, user)
+ except Exception as e:
+ # Log error and return empty list or re-raise depending on requirements
+ return []
# If dataset_ids are provided filter these datasets based on what user has permission for.
if dataset_ids:
- existing_datasets = await get_specific_user_permission_datasets(
- user.id, permission_type, dataset_ids
- )
+ try:
+ existing_datasets = await get_specific_user_permission_datasets(
+ user.id, permission_type, dataset_ids
+ )
+ except Exception as e:
+ return []
else:
existing_datasets = []
else:
# If no datasets are provided, work with all existing datasets user has permission for.
- existing_datasets = await get_all_user_permission_datasets(user, permission_type)
+ try:
+ existing_datasets = await get_all_user_permission_datasets(user, permission_type)
+ except Exception as e:
+ return []
🤖 Prompt for AI Agents
In cognee/modules/data/methods/get_authorized_existing_datasets.py around lines
25 to 39, add try-except blocks around the calls to get_dataset_ids,
get_specific_user_permission_datasets, and get_all_user_permission_datasets to
catch potential exceptions. Handle these exceptions appropriately, such as
logging the error and returning an empty list or a suitable fallback, to prevent
unhandled errors from propagating and crashing the function.
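If re-raising is not desired, the fallback strategy described above can be factored into a single small helper instead of three separate try-except blocks. This is a sketch with invented names; real code should catch narrower exception types than bare `Exception`:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def call_with_fallback(coro, fallback):
    """Await a coroutine; on failure, log the error and return a fallback value."""
    try:
        return await coro
    except Exception as error:  # narrow to expected DB/permission errors in real code
        logger.error("Permission lookup failed: %s", error)
        return fallback

# Stand-ins for the permission helpers in the review comment.
async def failing_lookup():
    raise TimeoutError("database unavailable")

async def working_lookup():
    return ["dataset-a", "dataset-b"]

empty = asyncio.run(call_with_fallback(failing_lookup(), fallback=[]))
datasets = asyncio.run(call_with_fallback(working_lookup(), fallback=[]))
```

Centralizing the fallback keeps the main function's branching readable and makes the error policy easy to change in one place.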
export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void) {
  // const data = await (
  return fetch("/v1/cognify", {
    method: "POST",
    headers: {
-      'Content-Type': 'application/json',
+      "Content-Type": "application/json",
    },
    body: JSON.stringify({
-      datasets: [dataset.id || dataset.name],
+      datasetIds: [dataset.id],
      runInBackground: false,
    }),
-  }).then((response) => response.json());
+  })
+    .then((response) => response.json())
+    .then(() => {
+      return getDatasetGraph(dataset)
+        .then((data) => {
+          onUpdate({
+            nodes: data.nodes,
+            edges: data.edges,
+          });
+        });
+    });
🛠️ Refactor suggestion
Add error handling and improve return consistency.
The function lacks error handling and has an inconsistent return pattern that could cause issues if the API call or graph fetching fails.
-export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void) {
- // const data = await (
- return fetch("/v1/cognify", {
+export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void): Promise<void> {
+ if (!dataset?.id) {
+ throw new Error("Dataset ID is required");
+ }
+
+ try {
+ await fetch("/v1/cognify", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
datasetIds: [dataset.id],
runInBackground: false,
}),
- })
- .then((response) => response.json())
- .then(() => {
- return getDatasetGraph(dataset)
- .then((data) => {
- onUpdate({
- nodes: data.nodes,
- edges: data.edges,
- });
- });
- });
- // )
+ });
+
+ const data = await getDatasetGraph(dataset);
+ onUpdate({
+ nodes: data.nodes,
+ edges: data.edges,
+ });
+ } catch (error) {
+ console.error("Failed to cognify dataset:", error);
+ throw error;
+  }
🤖 Prompt for AI Agents
In cognee-frontend/src/modules/datasets/cognifyDataset.ts around lines 10 to 31,
the function lacks error handling and returns inconsistently, which can cause
issues if the fetch or graph retrieval fails. Add try-catch blocks or use
promise catch handlers to handle errors from both the fetch call and
getDatasetGraph. Also, ensure the function consistently returns a value or a
promise, such as returning the final data or a status, to maintain predictable
behavior.
export default function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
  return new Promise((resolve, reject) => {
    if (response.status === 401) {
-      window.location.href = '/auth';
-      return;
+      if (retry) {
+        return retry(response)
+          .catch(() => {
+            return redirect("/auth/login");
+          });
+      } else {
+        return redirect("/auth/login");
+      }
    }
    if (!response.ok) {
      return response.json().then(error => reject(error));
    }

-    return resolve(response);
+    if (response.status >= 200 && response.status < 300) {
+      return resolve(response);
+    }
+
+    return reject(response);
  });
}
Fix promise handling and simplify logic.
The function has several issues with promise handling and redundant logic that could cause unexpected behavior.
-export default function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
- return new Promise((resolve, reject) => {
- if (response.status === 401) {
- if (retry) {
- return retry(response)
- .catch(() => {
- return redirect("/auth/login");
- });
- } else {
- return redirect("/auth/login");
- }
- }
- if (!response.ok) {
- return response.json().then(error => reject(error));
- }
-
- if (response.status >= 200 && response.status < 300) {
- return resolve(response);
- }
-
- return reject(response);
- });
-}
+export default async function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
+ if (response.status === 401) {
+ if (retry) {
+ try {
+ return await retry(response);
+ } catch {
+ redirect("/auth/login");
+ return response; // This won't be reached due to redirect
+ }
+ } else {
+ redirect("/auth/login");
+ return response; // This won't be reached due to redirect
+ }
+ }
+
+ if (!response.ok) {
+ const error = await response.json();
+ throw error;
+ }
+
+ return response;
+}

🤖 Prompt for AI Agents
In cognee-frontend/src/utils/handleServerErrors.ts lines 3 to 25, the promise
handling is inconsistent and the logic is redundant, which may cause unexpected
behavior. Refactor the function to avoid mixing explicit Promise construction
with returning promises inside it. Use async/await or chain promises properly to
handle the 401 status with retry logic, handle non-ok responses by rejecting
with parsed error JSON, and resolve or reject based on response status cleanly
without redundant checks.
vector_db_config: dict = None,
graph_db_config: dict = None,
dataset_id: UUID = None,
🛠️ Refactor suggestion
Add proper type hints and parameter validation.
The new parameters should be properly typed as Optional and validated.
user: User = None,
node_set: Optional[List[str]] = None,
- vector_db_config: dict = None,
- graph_db_config: dict = None,
- dataset_id: UUID = None,
+ vector_db_config: Optional[dict] = None,
+ graph_db_config: Optional[dict] = None,
+ dataset_id: Optional[UUID] = None,
):
+ # Validate that either dataset_name or dataset_id is provided
+ if not dataset_name and not dataset_id:
+ raise ValueError("Either dataset_name or dataset_id must be provided")
🤖 Prompt for AI Agents
In cognee/api/v1/add/add.py around lines 15 to 17, the parameters
vector_db_config, graph_db_config, and dataset_id lack proper Optional type
hints and validation. Update their type hints to use Optional from typing (e.g.,
Optional[dict] and Optional[UUID]) and add validation logic to check if the
parameters are None or of the expected type before use, raising appropriate
errors if validation fails.
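The validation the prompt asks for can be illustrated standalone. `resolve_dataset` below is a hypothetical stand-in for the real `add` signature, showing the `Optional` hints and the either-or check:

```python
from typing import Optional, Union
from uuid import UUID, uuid4

def resolve_dataset(
    dataset_name: Optional[str] = None,
    dataset_id: Optional[UUID] = None,
) -> Union[str, UUID]:
    """Require at least one of name or id; prefer the id when both are given."""
    if not dataset_name and not dataset_id:
        raise ValueError("Either dataset_name or dataset_id must be provided")
    return dataset_id if dataset_id else dataset_name

name_result = resolve_dataset(dataset_name="my_dataset")
id_value = uuid4()
id_result = resolve_dataset(dataset_id=id_value)

try:
    resolve_dataset()  # neither argument: should fail fast
    raised = False
except ValueError:
    raised = True
```

Failing fast at the boundary keeps the `None` from propagating into the pipeline where the eventual error would be harder to trace.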
pipeline_run_info = None

async for run_info in cognee_pipeline(
    tasks=tasks,
    datasets=dataset_id if dataset_id else dataset_name,
    data=data,
    user=user,
    pipeline_name="add_pipeline",
    vector_db_config=vector_db_config,
    graph_db_config=graph_db_config,
):
    pipeline_run_info = run_info

return pipeline_run_info
🛠️ Refactor suggestion
Add error handling and ensure pipeline_run_info is not None.
The async iteration lacks error handling and could return None if the pipeline yields no results.
pipeline_run_info = None
- async for run_info in cognee_pipeline(
- tasks=tasks,
- datasets=dataset_id if dataset_id else dataset_name,
- data=data,
- user=user,
- pipeline_name="add_pipeline",
- vector_db_config=vector_db_config,
- graph_db_config=graph_db_config,
- ):
- pipeline_run_info = run_info
+ try:
+ async for run_info in cognee_pipeline(
+ tasks=tasks,
+ datasets=dataset_id if dataset_id else dataset_name,
+ data=data,
+ user=user,
+ pipeline_name="add_pipeline",
+ vector_db_config=vector_db_config,
+ graph_db_config=graph_db_config,
+ ):
+ pipeline_run_info = run_info
+ except Exception as e:
+ # Log the error and re-raise with context
+ raise RuntimeError(f"Pipeline execution failed: {str(e)}") from e
- return pipeline_run_info
+ if pipeline_run_info is None:
+ raise RuntimeError("Pipeline completed but no run info was generated")
+
+ return pipeline_run_info
🤖 Prompt for AI Agents
In cognee/api/v1/add/add.py around lines 24 to 37, add error handling around the
async for loop that iterates over cognee_pipeline to catch and handle any
exceptions that may occur during iteration. After the loop, check if
pipeline_run_info is still None, and if so, raise an appropriate error or return
a meaningful response to ensure the function does not return None silently.
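The shape of that guard, using a stand-in async generator instead of `cognee_pipeline` (the dict statuses here are invented for illustration), could be:

```python
import asyncio

async def fake_pipeline():
    """Stand-in for cognee_pipeline: yields successive status updates."""
    for status in ("started", "processing", "completed"):
        yield {"status": status}

async def run_and_get_last_info(pipeline):
    """Consume the pipeline, keep the last update, and refuse to return None."""
    last_info = None
    try:
        async for run_info in pipeline:
            last_info = run_info
    except Exception as error:
        raise RuntimeError(f"Pipeline execution failed: {error}") from error
    if last_info is None:
        raise RuntimeError("Pipeline completed but no run info was generated")
    return last_info

final_info = asyncio.run(run_and_get_last_info(fake_pipeline()))
```

The post-loop `None` check is what distinguishes "pipeline yielded nothing" from a normal run, which the original code silently conflated.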
return addData(dataset, files)
  .then(() => {
    const onUpdate = (data: NodesAndEdges) => {
      onData({
        nodes: data.nodes,
        links: data.edges,
      });
      setProcessingFilesDone();
    };

    return cognifyDataset(dataset, onUpdate)
      .then(() => {
        refreshDatasets();
      });
  });
🛠️ Refactor suggestion
Improve error handling in file upload chain.
The promise chain lacks proper error handling, which could leave the component in a processing state indefinitely.
- return addData(dataset, files)
- .then(() => {
- const onUpdate = (data: NodesAndEdges) => {
- onData({
- nodes: data.nodes,
- links: data.edges,
- });
- setProcessingFilesDone();
- };
-
- return cognifyDataset(dataset, onUpdate)
- .then(() => {
- refreshDatasets();
- });
- });
+ return addData(dataset, files)
+ .then(() => {
+ const onUpdate = (data: NodesAndEdges) => {
+ onData({
+ nodes: data.nodes,
+ links: data.edges,
+ });
+ setProcessingFilesDone();
+ };
+
+ return cognifyDataset(dataset, onUpdate)
+ .then(() => {
+ refreshDatasets();
+ });
+ })
+ .catch((error) => {
+ console.error("Failed to process files:", error);
+ setProcessingFilesDone();
+ // Show error message to user
+ alert("Failed to process files. Please try again.");
+ });
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx around lines 77 to 91,
the promise chain handling file upload and dataset processing lacks error
handling, risking the component staying in a processing state indefinitely. Add
catch blocks to handle errors from addData and cognifyDataset calls, ensuring
that on errors, the component state is updated appropriately (e.g., clearing
processing flags) and errors are logged or reported to inform the user or
developer.
if (!event.target.files) {
  return;
}

const files: File[] = Array.from(event.target.files);

if (!files.length) {
  return;
}
🛠️ Refactor suggestion
Add file validation and type checking.
The file upload handling lacks validation for file types, sizes, and other constraints that could prevent processing issues.
if (!event.target.files) {
+ setProcessingFilesDone();
return;
}
const files: File[] = Array.from(event.target.files);
if (!files.length) {
+ setProcessingFilesDone();
return;
}
+ // Validate file types and sizes
+ const maxFileSize = 10 * 1024 * 1024; // 10MB
+ const allowedTypes = ['text/plain', 'application/pdf', 'application/msword'];
+
+ const invalidFiles = files.filter(file =>
+ file.size > maxFileSize || !allowedTypes.includes(file.type)
+ );
+
+ if (invalidFiles.length > 0) {
+ alert(`Invalid files detected. Please check file types and sizes.`);
+ setProcessingFilesDone();
+ return;
+ }
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx around lines 67 to 75,
the file upload handler currently does not validate the uploaded files. Add
checks to validate file types against allowed MIME types, verify file sizes do
not exceed defined limits, and handle any other relevant constraints before
processing the files. This will prevent invalid or problematic files from being
processed.
  });
}

return global.fetch("http://localhost:8000/api" + url, {
🛠️ Refactor suggestion
Make API base URL configurable.
The hard-coded http://localhost:8000/api base URL should be configurable through environment variables to support different deployment environments.
- return global.fetch("http://localhost:8000/api" + url, {
+ return global.fetch(process.env.NEXT_PUBLIC_API_BASE_URL + url, {

And add to your environment configuration:
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000/api
🤖 Prompt for AI Agents
In cognee-frontend/src/utils/fetch.ts at line 25, replace the hard-coded base
URL "http://localhost:8000/api" with a reference to an environment variable such
as process.env.NEXT_PUBLIC_API_BASE_URL. This makes the API base URL
configurable for different environments. Ensure to add
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000/api to your environment
configuration file for local development.
async def query(self, query: str, params: Optional[dict] = None) -> List[Tuple]:
    """Execute a Kuzu query via the REST API."""
    try:
        # Initialize schema if needed
        if not self._schema_initialized:
            await self._initialize_schema()

        response = await self._make_request(
            "/query", {"query": query, "parameters": params or {}}
        )

        # Convert response to list of tuples
        results = []
        if "data" in response:
            for row in response["data"]:
                processed_row = []
                for val in row:
                    if isinstance(val, dict) and "properties" in val:
                        try:
                            props = json.loads(val["properties"])
                            val.update(props)
                            del val["properties"]
                        except json.JSONDecodeError:
                            pass
                    processed_row.append(val)
                results.append(tuple(processed_row))

        return results
    except Exception as e:
        logger.error(f"Query execution failed: {str(e)}")
        logger.error(f"Query: {query}")
        logger.error(f"Parameters: {params}")
        raise
🛠️ Refactor suggestion
Refactor query method to reduce nested complexity.
The query method has too many nested blocks (6/5 limit). Consider extracting helper methods to improve readability.
Extract response processing logic:
async def query(self, query: str, params: Optional[dict] = None) -> List[Tuple]:
"""Execute a Kuzu query via the REST API."""
try:
# Initialize schema if needed
if not self._schema_initialized:
await self._initialize_schema()
response = await self._make_request(
"/query", {"query": query, "parameters": params or {}}
)
- # Convert response to list of tuples
- results = []
- if "data" in response:
- for row in response["data"]:
- processed_row = []
- for val in row:
- if isinstance(val, dict) and "properties" in val:
- try:
- props = json.loads(val["properties"])
- val.update(props)
- del val["properties"]
- except json.JSONDecodeError:
- pass
- processed_row.append(val)
- results.append(tuple(processed_row))
-
- return results
+ return self._process_response(response)
except Exception as e:
logger.error(f"Query execution failed: {str(e)}")
logger.error(f"Query: {query}")
logger.error(f"Parameters: {params}")
raise
+def _process_response(self, response: dict) -> List[Tuple]:
+ """Process API response and convert to list of tuples."""
+ results = []
+ if "data" not in response:
+ return results
+
+ for row in response["data"]:
+ processed_row = []
+ for val in row:
+ processed_val = self._process_value(val)
+ processed_row.append(processed_val)
+ results.append(tuple(processed_row))
+ return results
+
+def _process_value(self, val):
+ """Process individual value from response."""
+ if isinstance(val, dict) and "properties" in val:
+ try:
+ props = json.loads(val["properties"])
+ val.update(props)
+ del val["properties"]
+ except json.JSONDecodeError:
+ pass
+ return val
🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 84-114: Too many nested blocks (6/5)
(R1702)
🤖 Prompt for AI Agents
In cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py around
lines 82 to 114, the query method is overly nested, making it hard to read.
Refactor by extracting the response processing logic that converts the response
data into a list of tuples into a separate helper method. This helper should
handle iterating over rows and processing each value, including JSON decoding of
properties. Then call this helper from the query method to reduce nesting and
improve clarity.
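The flattening logic the suggestion extracts can be exercised on its own. Below is a standalone sketch of the two helpers with a hypothetical response payload; the function names mirror the suggestion but drop the adapter's `self` so the snippet runs by itself:

```python
import json

def process_value(val):
    """Flatten a JSON-encoded "properties" payload into its parent dict."""
    if isinstance(val, dict) and "properties" in val:
        try:
            props = json.loads(val["properties"])
            val.update(props)
            del val["properties"]
        except json.JSONDecodeError:
            pass  # leave malformed properties untouched
    return val

def process_response(response):
    """Convert an API-style {"data": [...]} payload into a list of tuples."""
    if "data" not in response:
        return []
    return [tuple(process_value(v) for v in row) for row in response["data"]]

# Hypothetical payload shaped like the adapter's REST response
sample = {"data": [[{"id": 1, "properties": '{"name": "node-a"}'}, 42]]}
rows = process_response(sample)
```

Note that `process_value` mutates the dict in place, exactly as the suggested helper does; callers holding a reference to the original response will see the flattened form.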
```python
data: List[UploadFile],
datasetId: Optional[UUID] = Form(default=None),
datasetName: Optional[str] = Form(default=None),
```
Fix the Form() call in argument defaults.
The static analysis tool flags Form(default=None) because the call runs at function definition time rather than at request time (Ruff B008). FastAPI evaluates parameter markers like Form() at definition time by design, so the behavior here is intentional; the shorter positional form Form(None) is equivalent and matches FastAPI's documented style.
Apply this diff to address the finding:
- datasetId: Optional[UUID] = Form(default=None),
+ datasetId: Optional[UUID] = Form(None),

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
24-24: Do not perform function call Form in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🤖 Prompt for AI Agents
In cognee/api/v1/add/routers/get_add_router.py at line 24, the argument default
value uses Form(default=None) which is evaluated at function definition time. To
fix this, remove the default argument from Form and instead use Form(None) so
that the default value is set at request time, preventing potential unexpected
behavior.
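The rationale behind Ruff's B008 is that Python evaluates a default expression once, when `def` executes, not on each call. A minimal stdlib sketch of that behavior (the function names are illustrative, not from the PR):

```python
calls = []

def make_default():
    """Record each invocation so we can observe when the default is evaluated."""
    calls.append("evaluated")
    return None

# The default expression runs exactly once, at definition time.
def handler(dataset_id=make_default()):
    return dataset_id

n_after_def = len(calls)   # already 1 before any call
handler()
handler()
n_after_calls = len(calls)  # still 1 after two calls
```

This is precisely why B008 warns on arbitrary calls in defaults; FastAPI's Form()/Query() markers are a deliberate exception, which is why projects often suppress B008 for FastAPI route signatures.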
```javascript
  username2: formElements.username2.value,
};

const websocket = new WebSocket("ws://localhost:8000/api/v1/crewai/subscribe");
```
🛠️ Refactor suggestion
Replace hardcoded localhost URL with configurable endpoint.
The hardcoded WebSocket URL will fail in production environments. Consider using environment variables or configuration for the WebSocket endpoint.
- const websocket = new WebSocket("ws://localhost:8000/api/v1/crewai/subscribe");
+ const websocket = new WebSocket(`${process.env.NEXT_PUBLIC_WS_URL || 'ws://localhost:8000'}/api/v1/crewai/subscribe`);
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CrewAITrigger.tsx at line 31, replace the
hardcoded WebSocket URL "ws://localhost:8000/api/v1/crewai/subscribe" with a
configurable endpoint by using an environment variable or configuration setting.
This ensures the WebSocket connection works correctly in different environments,
including production. Update the code to read the WebSocket URL from process.env
or a similar configuration source.
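The same environment-override-with-local-fallback pattern applies on the backend side too. A minimal Python sketch, assuming a hypothetical `COGNEE_WS_URL` variable (the name is illustrative, not an existing Cognee setting):

```python
import os

def resolve_ws_url(path, default_base="ws://localhost:8000"):
    """Pick the WebSocket base from an environment override, defaulting to localhost for dev."""
    base = os.environ.get("COGNEE_WS_URL", default_base)
    return base.rstrip("/") + path

dev_url = resolve_ws_url("/api/v1/crewai/subscribe")

os.environ["COGNEE_WS_URL"] = "wss://example.com"
prod_url = resolve_ws_url("/api/v1/crewai/subscribe")
```

Keeping the default pointed at localhost preserves the current zero-config developer experience while letting deployments override the endpoint.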
```javascript
  alert("Please fill in both name and value fields for the new property.");
}
```
🛠️ Refactor suggestion
Replace alert() with proper error feedback.
Using alert() provides poor user experience and doesn't fit modern UI patterns. Consider using a toast notification or inline error message.
- } else {
- alert("Please fill in both name and value fields for the new property.");
- }
+ } else {
+ // Add error state to show validation message
+ setPropertyError("Please fill in both name and value fields for the new property.");
+ }

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/GraphControls.tsx around lines 87 to 88,
replace the use of alert() for error feedback with a more user-friendly approach
such as a toast notification or an inline error message component. Remove the
alert() call and implement a state variable to track the error message, then
display this message in the UI near the input fields to provide clear and
non-intrusive feedback to the user.
```python
user,
datasets,
graph_db_config: dict = None,
vector_db_config: dict = False,
```
Fix inconsistent default parameter type (duplicate issue).
Same issue as above in the background process function.
-vector_db_config: dict = False,
+vector_db_config: dict = None,
🤖 Prompt for AI Agents
In cognee/api/v1/cognify/cognify.py at line 89, the default value for the
parameter vector_db_config is set to False, which is inconsistent with its
declared type dict. Change the default value from False to None (or an empty
dictionary {}) to match the expected type and maintain consistency with other
similar functions.
Review continued from previous batch...
```python
user,
datasets,
graph_db_config: dict = None,
vector_db_config: dict = False,
```
Fix inconsistent default parameter type.
The vector_db_config parameter has a default of False instead of None or {}, which is inconsistent with its dict type annotation.
-vector_db_config: dict = False,
+vector_db_config: dict = None,
🤖 Prompt for AI Agents
In cognee/api/v1/cognify/cognify.py at line 67, the parameter vector_db_config
is annotated as a dict but has a default value of False, which is inconsistent.
Change the default value from False to None or an empty dictionary {} to match
the dict type annotation and avoid type errors.
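Why the False default matters in practice: a common normalization guard like `if config is None` silently passes False through to code expecting a dict. A minimal sketch of the None-default idiom the suggestion implies (`run_pipeline` and the `"provider"` key are illustrative, not the actual Cognee signature):

```python
def run_pipeline(vector_db_config: dict = None):
    """Normalize the config: treat None as "use defaults" via an explicit identity check."""
    if vector_db_config is None:
        vector_db_config = {}
    return vector_db_config

ok = run_pipeline()
explicit = run_pipeline({"provider": "lancedb"})
# With a default of False, the `is None` guard above would pass False
# through to downstream code expecting a dict, which would fail on the
# first key access or .get() call.
```

Using `None` as the sentinel (rather than `{}` directly) also avoids the mutable-default pitfall, since a shared `{}` default would be reused across calls.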
Description
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.