Merge with Dev #1000
Conversation
Commit history (each commit's PR template included the affirmation "I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin"; it applies to every entry below):

- Resolve issue with .venv being broken when using docker compose with Cognee (co-authored by Boris Arzentar)
- … 1947 (#760): no description provided (co-authored by Boris and Igor Ilic)
- Add support for UV and for Poetry package management
- Switch typing from str to UUID for NetworkX node_id
- Add both sse and stdio support for Cognee MCP
- …83] (#782): Add log handling options for cognee exceptions
- Fix issue with failing versions gh actions
- No description provided (co-authored by Vasilije)
- No description provided
- No description provided (co-authored by Vasilije)
- No description provided (co-authored by Hande and Vasilije)
- No description provided (co-authored by Hande and Vasilije)
- Add support for the Memgraph graph database following the [graph database integration guide](https://docs.cognee.ai/contributing/adding-providers/graph-db/graph-database-integration): implemented `MemgraphAdapter` for interfacing with Memgraph; updated `get_graph_engine.py` to return MemgraphAdapter when appropriate; added a test script `test_memgraph.py`; created a dedicated test workflow `.github/workflows/test_memgraph.yml` (co-authored by Vasilije and Boris)
- refactor: Handle boto3 s3fs dependencies better
- No description provided
- Update LanceDB and rewrite data points to run async (co-authored by Boris and Boris Arzentar)
- No description provided
- No description provided
- Short demo, as discussed with @hande-k and Lazar, illustrating how to get the pagerank rankings from the knowledge graph given the nx engine; a POC and a first step towards solving #643 (co-authored by Boris, Hande, and Vasilije)
- Added tools to check current cognify and codify status
- Fixes url typo in general adapters
- No description provided
- No description provided (co-authored by hajdul88, lxobr, Igor Ilic, Hande, and Vasilije)
- No description provided (five consecutive commits)
- No description provided (co-authored by Igor Ilic)
- Resolve issue with ollama and gemini ci cd
- …s3fs (#978): Makes s3 pathway imports optional so cognee can run without s3fs
- Add test of MCP functionality and starting of MCP server, fix some MCP and LanceDB issues
- Replaces Owlready2 with RDFLib (co-authored by Igor Ilic)
- No description provided
- Remove cognee models from permissions migration
- Add how to use postgres with Cognee docker compose
- No description provided (co-authored by Igor Ilic)
- Simple Cognee endpoint testing (co-authored by Boris)
- No description provided
Walkthrough

This update introduces comprehensive backend access control and dataset-scoped database management across the Cognee platform. It implements dataset-specific ACLs, context-aware database configuration, and permission-aware API endpoints. The pipeline execution model is refactored to support streaming pipeline run info and background processing. New exception classes, authentication strategies, and utility modules are added, while several legacy or redundant files are removed or replaced.
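The dataset-specific ACL idea described above can be illustrated with a minimal in-memory sketch. All names and the data structure here are hypothetical stand-ins for illustration, not Cognee's actual schema:

```python
from uuid import UUID, uuid4

# Hypothetical in-memory ACL: maps (user_id, dataset_id) to a set of permissions.
# A sketch of dataset-scoped access control, not Cognee's real implementation.
acl: dict[tuple[UUID, UUID], set[str]] = {}

def grant(user_id: UUID, dataset_id: UUID, permission: str) -> None:
    """Record that a user holds a permission on one specific dataset."""
    acl.setdefault((user_id, dataset_id), set()).add(permission)

def has_permission(user_id: UUID, dataset_id: UUID, permission: str) -> bool:
    """Check a permission scoped to a single dataset (no global roles here)."""
    return permission in acl.get((user_id, dataset_id), set())

owner, dataset = uuid4(), uuid4()
grant(owner, dataset, "read")
grant(owner, dataset, "write")
print(has_permission(owner, dataset, "write"))   # True
print(has_permission(uuid4(), dataset, "read"))  # False: other users get nothing by default
```

The point of scoping the key on `(user, dataset)` rather than on the user alone is that a permission-aware endpoint can check each requested `dataset_id` independently, which matches the per-dataset search flow in the sequence diagram below.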
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant API
    participant Auth
    participant DB
    participant Pipeline
    participant Queue
    User->>API: POST /v1/datasets/ (with name)
    API->>DB: Check if dataset exists
    alt Exists
        DB-->>API: Return existing dataset
    else Not exists
        API->>DB: Create dataset, assign owner
        DB-->>API: Return new dataset
    end
    API-->>User: Dataset info
    User->>API: POST /v1/add/ (with dataset_id)
    API->>DB: Ingest data to dataset
    DB-->>API: Data added
    API-->>User: Confirmation
    User->>API: POST /v1/cognify/ (with dataset_ids, run_in_background)
    API->>Pipeline: Start pipeline run (per dataset)
    Pipeline->>Queue: Stream PipelineRunInfo events
    API-->>User: PipelineRunInfo (initial)
    User->>API: WS /v1/cognify/subscribe/{pipeline_run_id}
    API->>Auth: Authenticate user via JWT
    Auth-->>API: User info
    loop Until completed
        Queue-->>API: PipelineRunInfo event
        API-->>User: Send event (activity, graph data, etc.)
    end
    User->>API: POST /v1/search/ (with dataset_ids)
    API->>DB: Check user permissions for datasets
    API->>DB: Run search per dataset (context-aware)
    DB-->>API: Search results
    API-->>User: Aggregated results
```
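The streaming middle section of this flow, where a pipeline producer pushes `PipelineRunInfo` events through a queue that the subscription handler drains until completion, can be sketched with an `asyncio.Queue`. The event fields and status names below are assumptions for illustration, not the real model:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class PipelineRunInfo:
    # Minimal stand-in for the real PipelineRunInfo model (fields are assumed).
    pipeline_run_id: str
    status: str
    payload: dict

async def run_pipeline(queue: asyncio.Queue) -> None:
    # Producer: a pipeline run streams status events into the queue.
    for status in ("started", "processing", "completed"):
        await queue.put(PipelineRunInfo("run-1", status, {}))
    await queue.put(None)  # sentinel: the stream is finished

async def forward_events(queue: asyncio.Queue) -> list[str]:
    # Consumer: the subscription handler forwards each event until the sentinel.
    seen = []
    while (info := await queue.get()) is not None:
        seen.append(info.status)
    return seen

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(run_pipeline(queue))
    statuses = await forward_events(queue)
    await producer
    return statuses

print(asyncio.run(main()))  # ['started', 'processing', 'completed']
```

In the real system the consumer side would live in the WebSocket handler and serialize each event to the subscriber; the queue decouples background pipeline execution from the connection lifetime.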
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 17350889 | Triggered | Generic Password | 4eb71cc | cognee/tests/test_remote_kuzu.py | View secret |
| 17116131 | Triggered | Generic Password | 3b07f3c | examples/database_examples/neo4j_example.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn the best practices here.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation
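As a toy illustration of what pre-commit secret detection does, the sketch below flags likely hardcoded credentials with a single regular expression. Real scanners (GitGuardian's ggshield, detect-secrets, and similar tools) use far richer detectors; this pattern is illustrative only:

```python
import re

# Toy detector: an assignment of a quoted literal to a credential-looking name.
# Purely illustrative; production scanners use entropy checks and many patterns.
SECRET_PATTERN = re.compile(
    r"\w*(password|passwd|secret|api_key)\s*=\s*[\"'][^\"']+[\"']",
    re.IGNORECASE,
)

def find_secrets(source: str) -> list[str]:
    """Return every suspicious assignment found in the given source text."""
    return [m.group(0) for m in SECRET_PATTERN.finditer(source)]

sample = 'db_password = "hunter2"\ntimeout = 30\n'
print(find_secrets(sample))  # ['db_password = "hunter2"']
```

Wired into a pre-commit hook, a check like this runs over staged files and aborts the commit on any match, which is exactly the "catch it before it leaves your machine" step recommended above.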
```yaml
name: Test using different async databases in parallel in Cognee
runs-on: ubuntu-22.04
steps:
  - name: Check out repository
    uses: actions/checkout@v4

  - name: Cognee Setup
    uses: ./.github/actions/cognee_setup
    with:
      python-version: '3.11.x'

  - name: Install specific graph db dependency
    run: |
      poetry install -E kuzu

  - name: Run parallel databases test
    env:
      ENV: 'dev'
      LLM_MODEL: ${{ secrets.LLM_MODEL }}
      LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
      LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
      LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
      EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
      EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
      EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
      EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
    run: poetry run python ./cognee/tests/test_parallel_databases.py
```
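The concurrency pattern a parallel multi-database test exercises can be sketched with `asyncio.gather`. The check functions below are hypothetical stand-ins, not the contents of `test_parallel_databases.py`:

```python
import asyncio

# Hypothetical per-backend smoke checks; the real test drives Cognee's adapters.
async def check_backend(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for real async driver calls
    return f"{name}: ok"

async def run_checks() -> list[str]:
    # Run every backend check concurrently; gather preserves argument order
    # regardless of which coroutine finishes first.
    return await asyncio.gather(
        check_backend("kuzu", 0.02),
        check_backend("lancedb", 0.01),
        check_backend("sqlite", 0.0),
    )

print(asyncio.run(run_checks()))  # ['kuzu: ok', 'lancedb: ok', 'sqlite: ok']
```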
Check warning

Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Copilot Autofix (AI, 6 months ago)
To fix the issue, add a permissions block to the workflow file. This block should specify the least privileges required for the workflow to function correctly. Since the workflow does not perform any write operations on the repository, the contents: read permission is sufficient. This change should be applied at the root level of the workflow to cover all jobs unless specific jobs require additional permissions.
```diff
@@ -2,2 +2,5 @@
 
+permissions:
+  contents: read
+
 on:
```
```yaml
name: Run MCP Test
runs-on: ubuntu-22.04
steps:
  - name: Check out repository
    uses: actions/checkout@v4

  - name: Set up Python
    uses: actions/setup-python@v5
    with:
      python-version: ${{ inputs.python-version }}

  - name: Install UV
    shell: bash
    run: |
      python -m pip install --upgrade pip
      pip install uv

  # This will install all dependencies along with the Cognee version deployed on PyPI
  - name: Install dependencies
    shell: bash
    working-directory: cognee-mcp
    run: uv sync

  # NEW: swap in current local cognee branch version
  - name: Override with cognee branch checkout
    working-directory: cognee-mcp
    run: |
      # Remove the Cognee wheel that came from PyPI
      uv pip uninstall cognee
      # Install the freshly-checked-out Cognee branch
      uv pip install --force-reinstall -e ../

  - name: Run MCP test
    env:
      ENV: 'dev'
      LLM_MODEL: ${{ secrets.LLM_MODEL }}
      LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
      LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
      LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
      EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
      EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
      EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
      EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
    working-directory: cognee-mcp
    run: uv run --no-sync python ./src/test_client.py
```
Check warning

Code scanning / CodeQL: Workflow does not contain permissions (Medium), job `test`

Copilot Autofix (AI, 6 months ago)
To fix the issue, we need to add a permissions block to the workflow. This block should specify the least privileges required for the workflow to function correctly. Since the workflow primarily interacts with the repository's contents and does not perform write operations, we can set contents: read as the permission. This ensures the workflow has only read access to the repository's contents, reducing the risk of unintended modifications.
The permissions block should be added at the root level of the workflow to apply to all jobs. Alternatively, it can be added to the specific job (test-mcp) if different jobs require different permissions.
```diff
@@ -5,2 +5,5 @@
 
+permissions:
+  contents: read
+
 jobs:
```
Actionable comments posted: 81
🔭 Outside diff range comments (2)
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (1)
402-413: Fix undefined variable issue in zero-limit search.

The `collection_size` variable is only defined when `limit == 0`, but it's used unconditionally on line 413, which will cause a NameError.

```diff
 client = self.get_qdrant_client()
+search_limit = limit
 if limit == 0:
     collection_size = await client.count(collection_name=collection_name)
+    search_limit = collection_size.count
 results = await client.search(
     collection_name=collection_name,
     query_vector=models.NamedVector(
         name="text",
         vector=query_vector if query_vector is not None else (await self.embed_data([query_text]))[0],
     ),
-    limit=limit if limit > 0 else collection_size.count,
+    limit=search_limit,
     with_vectors=with_vector,
 )
```

cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)

115-122: Verify the query method call should be awaited.

The `has_node` method calls `self.query()` but doesn't await it, which will cause issues since `query` is an async method.

Apply this fix:

```diff
-        results = self.query(
+        results = await self.query(
```
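The flagged pattern is easy to demonstrate in isolation: calling an async method without `await` yields a coroutine object rather than the query results. A minimal, self-contained sketch with a toy `Adapter` (not the real Neo4j adapter):

```python
import asyncio
import inspect

class Adapter:
    async def query(self, q: str) -> list[str]:
        # Toy async method standing in for a real database query.
        return [f"row for {q}"]

async def main() -> None:
    adapter = Adapter()

    broken = adapter.query("MATCH (n) RETURN n")  # missing await: a coroutine, not results
    assert inspect.iscoroutine(broken)
    broken.close()  # avoid a "coroutine was never awaited" warning

    fixed = await adapter.query("MATCH (n) RETURN n")  # awaited: the actual result list
    assert fixed == ["row for MATCH (n) RETURN n"]

asyncio.run(main())
print("ok")
```

Any downstream code that tries to iterate or index the un-awaited coroutine fails at runtime, which is why the review asks for the `await`.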
🧹 Nitpick comments (91)
CONTRIBUTING.md (1)
1-3: Use GitHub-compatible admonition syntax

The `[!IMPORTANT]` tag is an Azure DevOps/Markdown extension and won't render on GitHub. To ensure it displays correctly, merge the two lines into a single blockquote with a bold prefix.

```diff
-> [!IMPORTANT]
-> **Note for contributors:** When branching out, create a new branch from the `dev` branch.
+> **Important:** When branching out, create a new branch from the `dev` branch.
```

cognee/modules/retrieval/utils/description_to_codepart_search.py (1)
2-2: Logging import placement and usage
The `setup_logging` and `ERROR` imports are only used in the `__main__` block. Consider moving these imports inside that block or removing them if the script entrypoint is rarely executed.

cognee/modules/data/exceptions/__init__.py (1)
10-11: Re-exported exceptions trigger unused-import warnings

These imports are intended to expose the new exception classes but raise F401. Add an explicit `__all__ = ["UnstructuredLibraryImportError", "UnauthorizedDataAccessError", "DatasetNotFoundError", "DatasetTypeError"]` or append `# noqa: F401` to suppress the lint errors.

cognee/modules/pipelines/methods/__init__.py (1)
1-1: Re-export pattern missing `__all__`

Importing `get_pipeline_run` here is for the package-level API but triggers unused-import. Define `__all__ = ["get_pipeline_run"]` or add `# noqa: F401` to clarify its re-export purpose.

cognee/modules/users/authentication/default/__init__.py (1)
1-2: Clarify public API exports
Re-exporting `default_transport` and `DefaultJWTStrategy` without an `__all__` list causes unused-import warnings. Add `__all__ = ["default_transport", "DefaultJWTStrategy"]` or use `# noqa: F401` to document intended exports.

cognee/modules/users/models/__init__.py (1)
4-4: Exported model import unused by module
The `DatasetDatabase` import is for namespacing but raises F401. Please add it to an `__all__` list or append `# noqa: F401` to indicate this is an intentional re-export.

cognee-frontend/src/ui/Partials/SettingsModal/SettingsModal.tsx (1)
1-10: Consider removing commented code entirely.

The entire component has been commented out, which creates code clutter and maintenance overhead. If this component is permanently deprecated, consider removing it entirely rather than keeping commented code.

If this is temporary, consider adding a TODO comment explaining when/why it will be restored.

```diff
-// import { Modal } from 'ohmy-ui';
-// import Settings from './Settings';
-
-// export default function SettingsModal({ isOpen = false, onClose = () => {} }) {
-//   return (
-//     <Modal isOpen={isOpen} onClose={onClose}>
-//       <Settings onDone={onClose} />
-//     </Modal>
-//   )
-// }
+// TODO: SettingsModal temporarily disabled - remove if permanently deprecated
```

cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1)
6-6: Consider maintaining themability with CSS variables.

The change from `var(--global-color-primary)` to hardcoded `white` reduces the component's themability. If the loading indicator needs to be white specifically, consider using a CSS variable like `var(--loading-indicator-color, white)` to maintain flexibility.

```diff
-  border: 0.18rem solid white;
+  border: 0.18rem solid var(--loading-indicator-color, white);
```

cognee-frontend/src/app/layout.tsx (1)
9-9: Consider revising description for production readiness.

The description "Cognee Dev Mexican Standoff" appears to be a development placeholder. Consider updating to a more professional description that accurately describes the application's purpose before production deployment.

```diff
-  description: "Cognee Dev Mexican Standoff",
+  description: "Cognee - AI-powered knowledge management platform",
```

cognee-frontend/src/modules/datasets/getDatasetGraph.ts (1)
3-6: Consider adding error handling for robustness.

The function implementation is clean and straightforward. However, consider adding error handling for network failures and invalid JSON responses to improve user experience.

Consider this enhanced version with error handling:

```diff
 export default function getDatasetGraph(dataset: { id: string }) {
   return fetch(`/v1/datasets/${dataset.id}/graph`)
-    .then((response) => response.json());
+    .then((response) => {
+      if (!response.ok) {
+        throw new Error(`Failed to fetch dataset graph: ${response.status}`);
+      }
+      return response.json();
+    })
+    .catch((error) => {
+      console.error('Error fetching dataset graph:', error);
+      throw error;
+    });
 }
```

cognee-frontend/src/modules/datasets/createDataset.ts (1)
3-12: Add error handling for create operations.

The function implementation follows a consistent pattern with other API functions. However, adding error handling would improve robustness, especially for create operations where users need clear feedback on success or failure.

Consider this enhanced version with error handling:

```diff
 export default function createDataset(dataset: { name: string }) {
   return fetch(`/v1/datasets/`, {
     method: "POST",
     body: JSON.stringify(dataset),
     headers: {
       "Content-Type": "application/json",
     }
   })
-    .then((response) => response.json());
+    .then((response) => {
+      if (!response.ok) {
+        throw new Error(`Failed to create dataset: ${response.status}`);
+      }
+      return response.json();
+    })
+    .catch((error) => {
+      console.error('Error creating dataset:', error);
+      throw error;
+    });
 }
```

cognee/modules/users/methods/get_user.py (1)
4-4: Remove unused import.

The `sqlalchemy.exc` import is unused and should be removed.

```diff
-import sqlalchemy.exc
```

cognee/api/v1/delete/exceptions.py (1)
29-39: Fix the docstring description.

The exception class implementation is correct, but the docstring has an incorrect description.

```diff
-    """Raised when a dataset cannot be found."""
+    """Raised when data cannot be found."""
```

cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1)
12-12: Accessibility: Add proper alt text and consider responsive image handling

The logo alt text could be more descriptive for screen readers, and the fixed dimensions might not be responsive.

```diff
-  <Image src="/images/cognee-logo-with-text.png" alt="Cognee logo" width={176} height={46} className="h-12 w-44 self-center mb-16" />
+  <Image src="/images/cognee-logo-with-text.png" alt="Cognee application logo" width={176} height={46} className="h-12 w-44 self-center mb-16" priority />
```

cognee-frontend/src/modules/ingestion/useDatasets.ts (1)
15-16: Consider using a more specific TypeScript type instead of disabling the lint rule

The ESLint disable for the `any` type could be avoided by using a more specific type for the timeout.

```diff
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  const statusTimeout = useRef<any>(null);
+  const statusTimeout = useRef<NodeJS.Timeout | null>(null);
```

cognee/modules/users/methods/get_user_by_email.py (2)
16-16: Consider using `selectinload` instead of `joinedload` for better performance

For one-to-many relationships like `roles`, `selectinload` typically performs better than `joinedload` as it avoids cartesian products.

```diff
-    .options(joinedload(User.roles), joinedload(User.tenant))
+    .options(selectinload(User.roles), joinedload(User.tenant))
```

Add the import:

```diff
-from sqlalchemy.orm import joinedload
+from sqlalchemy.orm import joinedload, selectinload
```

9-21: Consider adding error handling for database connection issues

While the current implementation is clean, consider wrapping the database operations in a try-catch block for production robustness.

```diff
 async def get_user_by_email(user_email: str):
     db_engine = get_relational_engine()
+    try:
         async with db_engine.get_async_session() as session:
             user = (
                 await session.execute(
                     select(User)
                     .options(joinedload(User.roles), joinedload(User.tenant))
                     .where(User.email == user_email)
                 )
             ).scalar()
             return user
+    except Exception as e:
+        logger.error(f"Error retrieving user by email {user_email}: {e}")
+        raise
```

cognee-frontend/src/ui/elements/index.ts (1)
1-8: Consider adding JSDoc comments for better developer experience

Adding brief descriptions of each component could improve the developer experience when using these exports.

```diff
+/**
+ * UI Elements - Core component library
+ */
+
+/** Modal dialog component */
 export { default as Modal } from "./Modal";
+/** Form input component */
 export { default as Input } from "./Input";
+/** Dropdown select component */
 export { default as Select } from "./Select";
+/** Multi-line text input component */
 export { default as TextArea } from "./TextArea";
+/** Primary call-to-action button */
 export { default as CTAButton } from "./CTAButton";
+/** Secondary ghost button */
 export { default as GhostButton } from "./GhostButton";
+/** Neutral styled button */
 export { default as NeutralButton } from "./NeutralButton";
+/** Status indicator component */
 export { default as StatusIndicator } from "./StatusIndicator";
```

cognee-frontend/src/app/auth/token/route.ts (1)
4-4: Consider removing unused parameter instead of disabling eslint.

The `request` parameter is marked as unused with an eslint disable comment. If it's truly not needed, consider removing it from the function signature for cleaner code.

```diff
-// eslint-disable-next-line @typescript-eslint/no-unused-vars
-export async function GET(request: Request) {
+export async function GET() {
```

cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1)
20-22: Consider improving type safety.

The current implementation uses `keyof typeof NODE_COLORS` which provides some type safety, but you could enhance it further by defining a union type for valid node types.

```diff
+type NodeType = keyof typeof NODE_COLORS;
+
-export default function getColorForNodeType(type: string) {
-  return NODE_COLORS[type as keyof typeof NODE_COLORS] || colors.gray[500];
+export default function getColorForNodeType(type: string): string {
+  return NODE_COLORS[type as NodeType] || formatHex(colors.gray[500]);
 }
```

Note: Also consider using `formatHex()` for the default color to maintain consistency.

cognee-frontend/src/app/(graph)/GraphLegend.tsx (1)
11-11: Document the 100-node limit rationale.

The arbitrary limit of 100 nodes for legend generation isn't explained. Consider documenting why this limit exists or making it configurable.

Add a comment explaining the rationale:

```diff
+  // Limit to first 100 nodes to prevent performance issues with large datasets
   for (let i = 0; i < Math.min(data?.length || 0, 100); i++) {
```

cognee/modules/data/methods/get_dataset_ids.py (1)
`9-20`: Clean up the function documentation.

The docstring mentions a `pipeline_name` parameter that doesn't exist in the function signature. This should be removed for clarity.

Apply this diff to clean up the documentation:

```diff
 """
 Function returns dataset IDs necessary based on provided input.
 It transforms raw strings into real dataset_ids with keeping write permissions in mind.
 If a user wants to write to a dataset he is not the owner of it must be provided through UUID.

 Args:
     datasets:
-    pipeline_name:
     user:

 Returns:
     a list of write access dataset_ids if they exist
 """
```

cognee-frontend/src/ui/elements/Modal.tsx (1)
`6-12`: Consider adding accessibility and UX improvements.

The modal lacks essential accessibility features and user experience enhancements:

- ARIA attributes for screen readers
- Focus management and a focus trap
- Escape key to close
- Click outside to close
- Prevent body scrolling while the modal is open

Consider enhancing the modal with these features:

```diff
+import React, { useEffect, useRef } from 'react';
+
 interface ModalProps {
   isOpen: boolean;
   children: React.ReactNode;
+  onClose?: () => void;
 }

-export default function Modal({ isOpen, children }: ModalProps) {
+export default function Modal({ isOpen, children, onClose }: ModalProps) {
+  const modalRef = useRef<HTMLDivElement>(null);
+
+  useEffect(() => {
+    const handleEscape = (event: KeyboardEvent) => {
+      if (event.key === 'Escape' && onClose) {
+        onClose();
+      }
+    };
+
+    if (isOpen) {
+      document.addEventListener('keydown', handleEscape);
+      document.body.style.overflow = 'hidden';
+    }
+
+    return () => {
+      document.removeEventListener('keydown', handleEscape);
+      document.body.style.overflow = 'unset';
+    };
+  }, [isOpen, onClose]);
+
+  const handleBackdropClick = (event: React.MouseEvent) => {
+    if (event.target === modalRef.current && onClose) {
+      onClose();
+    }
+  };
+
   return isOpen && (
-    <div className="fixed top-0 left-0 right-0 bottom-0 backdrop-blur-lg z-50 flex items-center justify-center">
+    <div
+      ref={modalRef}
+      className="fixed top-0 left-0 right-0 bottom-0 backdrop-blur-lg z-50 flex items-center justify-center"
+      onClick={handleBackdropClick}
+      role="dialog"
+      aria-modal="true"
+    >
       {children}
     </div>
   );
 }
```

cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1)
`6-6`: Add documentation for the script's purpose.

Consider adding a docstring to explain what this script does and when to use it.

```diff
+"""
+Script to display statistics from a remote Kuzu graph database.
+Shows node count, edge count, and sample data for debugging/monitoring purposes.
+"""
+
 async def main():
```

cognee-frontend/src/ui/elements/StatusIndicator.tsx (1)
`2-7`: Extract the color mapping to constants for better maintainability.

Hardcoded color values make the component harder to maintain and test. Consider extracting them to constants.

```diff
+const STATUS_COLORS = {
+  DATASET_PROCESSING_STARTED: "#ffd500",
+  DATASET_PROCESSING_INITIATED: "#ffd500",
+  DATASET_PROCESSING_COMPLETED: "#53ff24",
+  DATASET_PROCESSING_ERRORED: "#ff5024",
+} as const;
+
 export default function StatusIndicator({ status }: { status: DatasetStatus }) {
-  const statusColor = {
-    DATASET_PROCESSING_STARTED: "#ffd500",
-    DATASET_PROCESSING_INITIATED: "#ffd500",
-    DATASET_PROCESSING_COMPLETED: "#53ff24",
-    DATASET_PROCESSING_ERRORED: "#ff5024",
-  };
```

cognee/modules/users/authentication/get_api_auth_backend.py (1)
`1-11`: Organize imports for better readability.

Consider grouping related imports together, separating third-party from local imports.

```diff
 import os
 from functools import lru_cache
+
 from fastapi_users import models
-
 from fastapi_users.authentication import (
     JWTStrategy,
     AuthenticationBackend,
 )

 from .api_bearer import api_bearer_transport, APIJWTStrategy
```

cognee/modules/users/authentication/get_client_auth_backend.py (2)
`18-18`: Move the import to module level.

Importing inside the inner function hides the dependency and is easy to miss when reading the module. Consider moving it to the top of the file.

```diff
+from .default.default_jwt_strategy import DefaultJWTStrategy
+
 @lru_cache
 def get_client_auth_backend():
     transport = default_transport

     def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
-        from .default.default_jwt_strategy import DefaultJWTStrategy
```

`13-30`: Consider refactoring to reduce code duplication.

The structure is very similar to `get_api_auth_backend.py`. Consider creating a shared helper function to reduce duplication.

```diff
+def _create_auth_backend(transport, strategy_class, lifetime_seconds):
+    def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
+        secret = os.getenv("FASTAPI_USERS_JWT_SECRET")
+        if not secret:
+            raise ValueError("FASTAPI_USERS_JWT_SECRET environment variable is required")
+        return strategy_class(secret, lifetime_seconds=lifetime_seconds)
+
+    return AuthenticationBackend(
+        name=transport.name,
+        transport=transport,
+        get_strategy=get_jwt_strategy,
+    )
+
 @lru_cache
 def get_client_auth_backend():
-    transport = default_transport
-
-    def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
-        secret = os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret")
-        return DefaultJWTStrategy(secret, lifetime_seconds=3600)
-
-    auth_backend = AuthenticationBackend(
-        name=transport.name,
-        transport=transport,
-        get_strategy=get_jwt_strategy,
-    )
-
-    return auth_backend
+    return _create_auth_backend(default_transport, DefaultJWTStrategy, 3600)
```

cognee-frontend/src/ui/elements/CTAButton.tsx (1)
`6-6`: Break up the long className string for readability.

The long className string makes the code harder to read and maintain. Consider breaking it into multiple lines or extracting it to a variable.

```diff
-  <button className={classNames("flex flex-row justify-center items-center gap-2 cursor-pointer rounded-3xl bg-indigo-600 px-4 py-3 text-white hover:bg-indigo-500 focus-visible:outline-2 focus-visible:outline-offset-2 focus-visible:outline-indigo-600", className)} {...props}>{children}</button>
+  <button
+    className={classNames(
+      "flex flex-row justify-center items-center gap-2 cursor-pointer rounded-3xl",
+      "bg-indigo-600 px-4 py-3 text-white hover:bg-indigo-500",
+      "focus-visible:outline-2 focus-visible:outline-offset-2 focus-visible:outline-indigo-600",
+      className
+    )}
+    {...props}
+  >
+    {children}
+  </button>
```

cognee-frontend/src/ui/Partials/FeedbackForm.tsx (1)
`51-67`: Consider adding client-side validation.

The form currently lacks client-side validation for the feedback field. Consider adding basic validation to improve the user experience.

```diff
 <div className="mb-4">
   <label className="block text-white" htmlFor="feedback">Feedback on agent's reasoning</label>
-  <TextArea id="feedback" name="feedback" type="text" placeholder="Your feedback" />
+  <TextArea
+    id="feedback"
+    name="feedback"
+    type="text"
+    placeholder="Your feedback"
+    required
+    minLength={10}
+  />
 </div>
```

cognee/eval_framework/evaluation/deep_eval_adapter.py (1)
`27-52`: Robust retry implementation, with minor suggestions for improvement.

The retry logic is well implemented, with exponential backoff and proper logging. Consider these enhancements:

- Specify exception types: catching all exceptions might mask programming errors.
- Cap the maximum delay: the current exponential backoff can reach 16-second sleeps on the final retry.

```diff
-    except Exception as e:
+    except (ConnectionError, TimeoutError, RuntimeError) as e:
```

Also consider adding a maximum delay cap:

```diff
-    time.sleep(2**attempt)  # Exponential backoff
+    time.sleep(min(2**attempt, 8))  # Cap at 8 seconds
```

cognee-frontend/src/app/auth/AuthPage.tsx (1)
`8-9`: Clean conversion to an async server component; consider error handling.

The conversion to an async server component is well executed. Consider adding error handling for the session retrieval:

```diff
 export default async function AuthPage() {
+  let session;
+  try {
-  const session = await auth0.getSession();
+    session = await auth0.getSession();
+  } catch (error) {
+    console.error('Failed to get session:', error);
+    session = null;
+  }
```

cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1)
`33-36`: Consider validating dataset_id before using it for file naming.

The function generates database file names directly from the `dataset_id` without validation. Consider adding a check to ensure the `dataset_id` doesn't contain filesystem-unsafe characters.

```diff
 dataset_id = await get_unique_dataset_id(dataset, user)
+
+# Validate dataset_id for filesystem safety
+if not dataset_id or any(char in str(dataset_id) for char in ['/', '\\', '..', '\0']):
+    raise ValueError(f"Invalid dataset_id for filesystem usage: {dataset_id}")

 vector_db_name = f"{dataset_id}.lance.db"
 graph_db_name = f"{dataset_id}.pkl"
```

cognee-frontend/src/app/(graph)/ActivityLog.tsx (1)
`46-46`: Fix inconsistent Tailwind CSS class usage.

The classes `flex-1/3` and `flex-2/3` are not standard Tailwind classes. Use proper flex basis classes.

```diff
-<span className="flex-1/3 text-xs text-gray-300 whitespace-nowrap mt-1.5">{formatter.format(activity.timestamp)}: </span>
-<span className="flex-2/3 text-white whitespace-normal">{activity.activity}</span>
+<span className="flex-none w-1/3 text-xs text-gray-300 whitespace-nowrap mt-1.5">{formatter.format(activity.timestamp)}: </span>
+<span className="flex-1 text-white whitespace-normal">{activity.activity}</span>
```

cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1)
`3-3`: Remove unused import.

The static analysis correctly identifies that `context_vector_db_config` is imported but never used.

```diff
-from cognee.context_global_variables import vector_db_config as context_vector_db_config
```

cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (1)
`638-644`: Consider adding parameter validation for node_id.

For consistency with other methods in the class, consider validating the `node_id` parameter to handle edge cases.

```diff
 async def get_node(self, node_id: str) -> Optional[Dict[str, Any]]:
     """Get a single node by ID."""
+    if not node_id:
+        return None
+
     query = """
     MATCH (node {id: $node_id})
     RETURN node
     """
     results = await self.query(query, {"node_id": node_id})
     return results[0]["node"] if results else None
```

cognee/modules/ingestion/classify.py (2)
`20-21`: Refactor: merge the isinstance calls for better readability.

The static analysis suggestion is valid; merge the isinstance calls for cleaner code.

```diff
-    if isinstance(data, BufferedReader) or isinstance(data, SpooledTemporaryFile):
+    if isinstance(data, (BufferedReader, SpooledTemporaryFile)):
```

`38-40`: Improve error message specificity.

The error message could be more helpful by distinguishing between unsupported data types and a missing s3fs installation.

```diff
-    raise IngestionError(
-        message=f"Type of data sent to classify(data: Union[str, BinaryIO) not supported or s3fs is not installed: {type(data)}"
-    )
+    if S3File is None:
+        raise IngestionError(
+            message=f"Unsupported data type {type(data)} or s3fs package not installed. Install s3fs for S3 file support."
+        )
+    else:
+        raise IngestionError(
+            message=f"Unsupported data type: {type(data)}. Supported types: str, BinaryIO, S3File"
+        )
```

cognee/modules/metrics/operations/get_pipeline_run_metrics.py (1)
`26-26`: Consider updating the function signature and name for clarity.

The function now processes a single `PipelineRunInfo` but still returns a list. Consider either:

- Returning a single `GraphMetrics` object (or None) instead of a list
- Renaming the function to `get_pipeline_run_metric` (singular)

This would better reflect the current behavior and reduce confusion.

```diff
-async def get_pipeline_run_metrics(pipeline_run: PipelineRunInfo, include_optional: bool):
+async def get_pipeline_run_metric(pipeline_run: PipelineRunInfo, include_optional: bool) -> GraphMetrics | None:
```

And update the return logic:

```diff
-    metrics_for_pipeline_runs = []
     # ... existing logic ...
     if existing_metrics:
-        metrics_for_pipeline_runs.append(existing_metrics)
+        return existing_metrics
     else:
         # ... create metrics ...
-        metrics_for_pipeline_runs.append(metrics)
-    return metrics_for_pipeline_runs
+        session.add(metrics)
+        await session.commit()
+        return metrics
```

cognee-frontend/src/ui/elements/Select.tsx (2)
`19-21`: Consider accessibility improvements for the dropdown icon.

The icon implementation looks good, but consider adding accessibility attributes for better screen reader support.

```diff
-  <span className="pointer-events-none absolute top-1/2 -mt-0.5 right-3 text-indigo-600 rotate-180">
+  <span
+    className="pointer-events-none absolute top-1/2 -mt-0.5 right-3 text-indigo-600 rotate-180"
+    aria-hidden="true"
+  >
```

`8-14`: Consider extracting base styles to a constant.

The Tailwind classes are quite long. Consider extracting them to improve maintainability.

```diff
+const DEFAULT_SELECT_CLASSES = "block w-full appearance-none rounded-md bg-white pl-4 pr-8 py-4 text-base text-gray-900 outline-1 -outline-offset-1 outline-gray-300 focus:outline-2 focus:-outline-offset-2 focus:outline-indigo-600";
+
 export default function Select({ children, className, ...props }: SelectHTMLAttributes<HTMLSelectElement>) {
   return (
     <div className="relative">
       <select
         className={
           classNames(
-            "block w-full appearance-none rounded-md bg-white pl-4 pr-8 py-4 text-base text-gray-900 outline-1 -outline-offset-1 outline-gray-300 focus:outline-2 focus:-outline-offset-2 focus:outline-indigo-600",
+            DEFAULT_SELECT_CLASSES,
             className,
           )
         }
```

cognee/modules/graph/methods/get_formatted_graph_data.py (3)
`13-30`: Consider refactoring the lambda to a named function for readability.

The lambda for node transformation is complex and would benefit from being extracted to a separate function for better readability and testability.

```diff
+def format_node(node):
+    """Format a graph node for frontend consumption."""
+    node_id, node_data = node
+    return {
+        "id": str(node_id),
+        "label": node_data["name"]
+        if ("name" in node_data and node_data["name"] != "")
+        else f"{node_data['type']}_{str(node_id)}",
+        "type": node_data["type"],
+        "properties": {
+            key: value
+            for key, value in node_data.items()
+            if key not in ["id", "type", "name", "created_at", "updated_at"]
+            and value is not None
+        },
+    }
+
 return {
-    "nodes": list(
-        map(
-            lambda node: {
-                "id": str(node[0]),
-                "label": node[1]["name"]
-                if ("name" in node[1] and node[1]["name"] != "")
-                else f"{node[1]['type']}_{str(node[0])}",
-                "type": node[1]["type"],
-                "properties": {
-                    key: value
-                    for key, value in node[1].items()
-                    if key not in ["id", "type", "name", "created_at", "updated_at"]
-                    and value is not None
-                },
-            },
-            nodes,
-        )
-    ),
+    "nodes": [format_node(node) for node in nodes],
```

`31-40`: Consider extracting an edge formatting function as well.

For consistency, consider extracting the edge formatting logic to a separate function.

```diff
+def format_edge(edge):
+    """Format a graph edge for frontend consumption."""
+    return {
+        "source": str(edge[0]),
+        "target": str(edge[1]),
+        "label": edge[2],
+    }
+
-    "edges": list(
-        map(
-            lambda edge: {
-                "source": str(edge[0]),
-                "target": str(edge[1]),
-                "label": edge[2],
-            },
-            edges,
-        )
-    ),
+    "edges": [format_edge(edge) for edge in edges],
```

`24-24`: Consider making excluded properties configurable.

The hardcoded list of excluded properties might need to be configurable for different use cases.

```diff
+EXCLUDED_PROPERTIES = {"id", "type", "name", "created_at", "updated_at"}
+
-    if key not in ["id", "type", "name", "created_at", "updated_at"]
+    if key not in EXCLUDED_PROPERTIES
```

cognee/modules/data/methods/load_or_create_datasets.py (2)
`22-24`: Consider the static analysis suggestion for comparison simplification.

The pylint suggestion to use `identifier in (ds.name, ds.id)` is valid but optional:

```diff
-    match = next(
-        (ds for ds in existing_datasets if ds.name == identifier or ds.id == identifier), None
-    )
+    match = next(
+        (ds for ds in existing_datasets if identifier in (ds.name, ds.id)), None
+    )
```

Both approaches are acceptable; the current one is slightly more explicit about what's being compared, and the performance difference is negligible for typical dataset counts.

`19-41`: Consider an optimization for lookup performance.

For better performance with large `existing_datasets` lists, consider using dictionary lookups instead of a linear search.

```diff
 async def load_or_create_datasets(
     dataset_names: List[Union[str, UUID]], existing_datasets: List[Dataset], user
 ) -> List[Dataset]:
     """
     Given a list of dataset identifiers (names or UUIDs), return Dataset instances:
     - If an identifier matches an existing Dataset (by name or id), reuse it.
     - Otherwise, create a new Dataset with a unique id.

     Note: Created dataset is not stored to database.
     """
+    # Create lookup dictionaries for O(1) access
+    datasets_by_name = {ds.name: ds for ds in existing_datasets}
+    datasets_by_id = {ds.id: ds for ds in existing_datasets}
+
     result: List[Dataset] = []

     for identifier in dataset_names:
-        # Try to find a matching dataset in the existing list
-        # If no matching dataset is found return None
-        match = next(
-            (ds for ds in existing_datasets if ds.name == identifier or ds.id == identifier), None
-        )
+        # Try to find a matching dataset using dictionary lookup
+        match = datasets_by_name.get(identifier) or datasets_by_id.get(identifier)

         if match:
             result.append(match)
             continue
```

cognee/modules/data/methods/get_authorized_existing_datasets.py (1)
`14-24`: Complete the function docstring and add parameter validation.

The docstring is missing the `permission_type` parameter description, and there's no validation of the permission type.

```diff
 async def get_authorized_existing_datasets(
     datasets: Union[list[str], list[UUID]], permission_type: str, user: User
 ) -> list[Dataset]:
     """
     Function returns a list of existing dataset objects user has access for based on datasets input.

     Args:
         datasets: List of dataset names or UUIDs to filter by
+        permission_type: Type of permission to check for (e.g., "read", "write")
         user: User object to check permissions for

     Returns:
         list of Dataset objects
     """
+    if not permission_type:
+        raise ValueError("permission_type cannot be empty")
```

cognee-frontend/src/modules/datasets/cognifyDataset.ts (1)
`34-57`: Remove the commented code or document its purpose.

The large block of commented WebSocket code should either be removed if not needed or properly documented with TODO comments explaining the planned implementation.

```diff
-  // const websocket = new WebSocket(`ws://localhost:8000/api/v1/cognify/subscribe/${data.pipeline_run_id}`);
-
-  // let isCognifyDone = false;
-
-  // websocket.onmessage = (event) => {
-  //   const data = JSON.parse(event.data);
-  //   onUpdate?.({
-  //     nodes: data.payload.nodes,
-  //     edges: data.payload.edges,
-  //   });
-
-  //   if (data.status === "PipelineRunCompleted") {
-  //     isCognifyDone = true;
-  //     websocket.close();
-  //   }
-  // };
-
-  // return new Promise(async (resolve) => {
-  //   while (!isCognifyDone) {
-  //     await new Promise(resolve => setTimeout(resolve, 1000));
-  //   }
-
-  //   resolve(true);
-  // });
+  // TODO: Implement WebSocket subscription for real-time updates
+  // This will replace the current fetch-then-get pattern with streaming updates
```

cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx (2)
`103-103`: Avoid the hardcoded dataset name.

The hardcoded "main_dataset" name should be configurable or generated dynamically.

```diff
-  createDataset({ name: "main_dataset" })
+  createDataset({ name: `dataset_${Date.now()}` })
```

Or better yet, allow the user to specify the dataset name through a UI input.

`31-158`: Consider breaking down component responsibilities.

This component handles too many concerns (dataset management, file uploads, modal state, search functionality). Consider extracting the following sub-components:

- `DatasetList` for rendering the dataset list
- `FileUploadButton` for file upload handling
- `SearchModal` for the search functionality

This would improve testability and code reusability while following the Single Responsibility Principle.
cognee-frontend/src/utils/fetch.ts (1)
`19-22`: Use a consistent fetch API and add a timeout.

Using `window.fetch` for token refresh and `global.fetch` for API calls is inconsistent and could cause confusion. Also consider adding timeout protection.

```diff
-  return window.fetch("/auth/token")
+  return global.fetch("/auth/token", { credentials: "include" })
     .then(() => {
       return fetch(url, options, retryCount + 1);
     });
```

cognee-frontend/src/modules/chat/hooks/useChat.ts (2)
`102-102`: Improve type safety by replacing `any` types.

Using `any` reduces type safety and makes the code harder to maintain. Consider creating proper interfaces for the system message types.

```diff
-// eslint-disable-next-line @typescript-eslint/no-explicit-any
-function convertToSearchTypeOutput(systemMessage: any[] | any, searchType: string): string {
+interface SystemMessage {
+  text?: string;
+  [key: string]: unknown;
+}
+
+function convertToSearchTypeOutput(systemMessage: SystemMessage[] | SystemMessage | string[], searchType: string): string {
```

`48-80`: Consider implementing request cancellation for concurrent operations.

The current implementation doesn't handle the case where a user sends multiple messages quickly. Previous requests should ideally be cancelled to prevent race conditions and outdated responses. You could use AbortController to cancel previous requests:

```diff
+  const [abortController, setAbortController] = useState<AbortController | null>(null);
+
   const handleMessageSending = useCallback((message: string, searchType: string) => {
+    // Cancel previous request if still pending
+    if (abortController) {
+      abortController.abort();
+    }
+
+    const newAbortController = new AbortController();
+    setAbortController(newAbortController);
+
     const sentMessageId = v4();
     // ... rest of the logic
-    return sendMessage(message, searchType)
+    return sendMessage(message, searchType, { signal: newAbortController.signal })
       .finally(() => {
+        setAbortController(null);
         enableSearchRun();
       });
```

cognee-frontend/src/app/auth/AuthForm.tsx (2)
`13-16`: Expand the error mapping for a better user experience.

Consider adding more comprehensive error handling to cover additional authentication scenarios.

```diff
 const errorsMap = {
   LOGIN_BAD_CREDENTIALS: "Invalid username or password",
   REGISTER_USER_ALREADY_EXISTS: "User already exists",
+  NETWORK_ERROR: "Unable to connect to server. Please check your internet connection.",
+  SERVER_ERROR: "Server error occurred. Please try again later.",
+  VALIDATION_ERROR: "Please check your input and try again.",
 };
```

`8-11`: Strengthen the TypeScript interface for better type safety.

The current interface extends HTMLFormElement but could be more specific about the expected form structure.

```diff
-interface AuthFormPayload extends HTMLFormElement {
-  email: HTMLInputElement;
-  password: HTMLInputElement;
-}
+interface AuthFormData {
+  email: string;
+  password: string;
+}
+
+interface AuthFormPayload extends HTMLFormElement {
+  readonly elements: HTMLFormControlsCollection & {
+    email: HTMLInputElement;
+    password: HTMLInputElement;
+  };
+}
```

cognee/modules/users/get_user_manager.py (1)
`23-36`: Remove the commented code or document why it's preserved.

Large blocks of commented code should either be removed or have clear documentation explaining why they're being kept. If this code is no longer needed, remove it entirely:

```diff
-    # async def get(self, id: models.ID) -> models.UP:
-    #     """
-    #     Get a user by id.
-
-    #     :param id: Id. of the user to retrieve.
-    #     :raises UserNotExists: The user does not exist.
-    #     :return: A user.
-    #     """
-    #     user = await get_user(id)
-
-    #     if user is None:
-    #         raise UserNotExists()
-
-    #     return user
```

If it needs to be preserved for future use, add a comment explaining why.
cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1)
`22-23`: Add error handling for missing queue removal.

The `remove_queue` function will raise a `KeyError` if the pipeline run ID doesn't exist in the dictionary.

```diff
 def remove_queue(pipeline_run_id: UUID):
-    pipeline_run_info_queues.pop(str(pipeline_run_id))
+    pipeline_run_info_queues.pop(str(pipeline_run_id), None)
```

cognee-frontend/src/ui/elements/TextArea.tsx (1)
`23-30`: Optimize callback dependencies to prevent unnecessary re-renders.

The `handleTextChange` callback includes `value` in its dependencies, which could cause unnecessary re-renders when the value changes frequently.

```diff
 const handleTextChange = useCallback((event: Event) => {
   const fakeTextAreaElement = event.target as HTMLDivElement;
   const newValue = fakeTextAreaElement.innerText;

-  if (newValue !== value) {
     onChange?.(newValue);
-  }
-}, [onChange, value]);
+}, [onChange]);
```

cognee-mcp/src/test_client.py (2)
`123-127`: Simplify nested context managers.

The nested `with` statements can be combined into a single statement for better readability.

```diff
-    async with stdio_client(server_params) as (read, write):
-        async with ClientSession(read, write) as session:
+    async with stdio_client(server_params) as (read, write), \
+            ClientSession(read, write) as session:
         # Initialize the session
         await session.initialize()
         yield session
```

`308-326`: Review the search test logic for a potential issue.

The search test breaks out of the loop when encountering the `NATURAL_LANGUAGE` or `CYPHER` search types, but `break` also skips any valid search types that come after them in the enum; `continue` is what's intended here.

```diff
 # Go through all Cognee search types
 for search_type in SearchType:
     # Don't test these search types
     if search_type in [SearchType.NATURAL_LANGUAGE, SearchType.CYPHER]:
-        break
+        continue
```

cognee-frontend/src/ui/Partials/SignInForm/SignInForm.tsx (1)
`51-71`: Consider improving form accessibility.

The form could benefit from additional accessibility features such as ARIA attributes and better error association.

```diff
-  <form onSubmit={signIn} className="flex flex-col gap-2">
+  <form onSubmit={signIn} className="flex flex-col gap-2" role="form" aria-label="Sign in form">
     <div className="flex flex-col gap-2">
       <div className="mb-4">
         <label className="block mb-2" htmlFor="email">Email</label>
-        <Input id="email" name="email" type="email" placeholder="Your email address" />
+        <Input
+          id="email"
+          name="email"
+          type="email"
+          placeholder="Your email address"
+          required
+          aria-describedby={signInError ? "signin-error" : undefined}
+        />
       </div>
       <div className="mb-4">
         <label className="block mb-2" htmlFor="password">Password</label>
-        <Input id="password" name="password" type="password" placeholder="Your password" />
+        <Input
+          id="password"
+          name="password"
+          type="password"
+          placeholder="Your password"
+          required
+          aria-describedby={signInError ? "signin-error" : undefined}
+        />
       </div>
     </div>
     <CTAButton type="submit">
       {submitButtonText}
       {isSigningIn && <LoadingIndicator />}
     </CTAButton>
     {signInError && (
-      <span className="text-s text-white">{signInError}</span>
+      <span id="signin-error" className="text-s text-white" role="alert" aria-live="polite">
+        {signInError}
+      </span>
     )}
   </form>
```

cognee/api/v1/search/search.py (2)
`24-25`: Merge the isinstance calls.

The two `isinstance` calls can be combined into a single call with a tuple of types for better readability.

```diff
-    if isinstance(datasets, UUID) or isinstance(datasets, str):
+    if isinstance(datasets, (UUID, str)):
         datasets = [datasets]
```

`12-22`: Consider reducing function parameters with a configuration object.

The function has 9 parameters, which exceeds recommended limits and makes it harder to maintain. Consider a search configuration object.

```diff
+from dataclasses import dataclass
+from typing import Union, Optional, List, Type
+
+@dataclass
+class SearchConfig:
+    query_text: str
+    query_type: SearchType = SearchType.GRAPH_COMPLETION
+    user: User = None
+    datasets: Optional[Union[list[str], str]] = None
+    dataset_ids: Optional[Union[list[UUID], UUID]] = None
+    system_prompt_path: str = "answer_simple_question.txt"
+    top_k: int = 10
+    node_type: Optional[Type] = None
+    node_name: Optional[List[str]] = None
+
-async def search(
-    query_text: str,
-    query_type: SearchType = SearchType.GRAPH_COMPLETION,
-    user: User = None,
-    datasets: Optional[Union[list[str], str]] = None,
-    dataset_ids: Optional[Union[list[UUID], UUID]] = None,
-    system_prompt_path: str = "answer_simple_question.txt",
-    top_k: int = 10,
-    node_type: Optional[Type] = None,
-    node_name: Optional[List[str]] = None,
-) -> list:
+async def search(config: SearchConfig) -> list:
+    # Extract parameters from config
+    query_text = config.query_text
+    query_type = config.query_type
+    user = config.user
+    datasets = config.datasets
+    dataset_ids = config.dataset_ids
+    # ... etc
```

cognee/modules/pipelines/models/PipelineRunInfo.py (2)
`16-18`: Use a class-level default for the status field.

The hardcoded status string works, and the `pass` statement is unnecessary when the class has a field definition:

```diff
 class PipelineRunStarted(PipelineRunInfo):
     status: str = "PipelineRunStarted"
-    pass
```

`21-33`: Apply the same pattern to the remaining subclasses.

The same improvement applies to all subclasses; remove the unnecessary `pass` statements since they have field definitions.

```diff
 class PipelineRunYield(PipelineRunInfo):
     status: str = "PipelineRunYield"
-    pass

 class PipelineRunCompleted(PipelineRunInfo):
     status: str = "PipelineRunCompleted"
-    pass

 class PipelineRunErrored(PipelineRunInfo):
     status: str = "PipelineRunErrored"
-    pass
```

Note: the pylint warnings about "too few public methods" are false positives; data models typically don't need multiple public methods.
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (1)
`156-159`: Simplify the conditional structure.

The `elif` after `return` is unnecessary and can be simplified.

```diff
 if self.qdrant_path is not None:
     return AsyncQdrantClient(path=self.qdrant_path, port=6333, https=is_prod)
-elif self.url is not None:
+if self.url is not None:
     return AsyncQdrantClient(url=self.url, api_key=self.api_key, port=6333, https=is_prod)
```

cognee-frontend/src/app/(graph)/GraphView.tsx (1)
`30-32`: Unused state variable.

The `isAddNodeFormOpen` value is extracted but never used elsewhere in the component, except being passed to `GraphControls`. If this state is only needed for passing to child components, the current implementation is fine; otherwise, consider removing the unused destructuring.
cognee-starter-kit/README.md (1)
`84-84`: Convert the bare URL to a proper markdown link.

```diff
-- create an account and API key from https://www.graphistry.com
+- create an account and API key from [Graphistry](https://www.graphistry.com)
```

cognee/context_global_variables.py (1)
`35-35`: Simplify the boolean condition check.

The current condition is unnecessarily convoluted and can be simplified for readability.

```diff
-    if not os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true":
+    if os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true":
```

cognee-frontend/src/app/(graph)/GraphVisualization.tsx (3)
`9-9`: Fix the typo in the interface name.

There's a typo in the interface name that should be corrected for consistency.

```diff
-interface GraphVisuzaliationProps {
+interface GraphVisualizationProps {
```

Also update the usage on line 20:

```diff
-export default function GraphVisualization({ ref, data, graphControls }: GraphVisuzaliationProps) {
+export default function GraphVisualization({ ref, data, graphControls }: GraphVisualizationProps) {
```

`23-56`: Consider removing or documenting the commented code.

There's a substantial amount of commented-out code for node-addition functionality. If this is planned future functionality, consider moving it to a separate branch or documenting its purpose. If it's no longer needed, remove it to improve readability.

`58-102`: Clean up commented code in the render function.

Similarly, the `renderNode` function contains significant commented-out code that hurts readability. Remove the unused code or document its future purpose.

cognee/modules/users/methods/get_authenticated_user.py (1)
8-48: Clean up commented authentication code.Since the migration to FastAPI Users is complete, consider removing the commented-out manual authentication code to improve code readability and reduce maintenance burden.
```diff
-# from types import SimpleNamespace
-
-# from ..get_fastapi_users import get_fastapi_users
-# from fastapi import HTTPException, Security
-# from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
-# import os
-# import jwt
-
-# from uuid import UUID
-
-# fastapi_users = get_fastapi_users()
-
-# # Allows Swagger to understand authorization type and allow single sign on for the Swagger docs to test backend
-# bearer_scheme = HTTPBearer(scheme_name="BearerAuth", description="Paste **Bearer <JWT>**")
-
-
-# async def get_authenticated_user(
-#     creds: HTTPAuthorizationCredentials = Security(bearer_scheme),
-# ) -> SimpleNamespace:
-#     """
-#     Extract and validate the JWT presented in the Authorization header.
-#     """
-#     if creds is None:  # header missing
-#         raise HTTPException(status_code=401, detail="Not authenticated")
-
-#     if creds.scheme.lower() != "bearer":  # shouldn't happen extra guard
-#         raise HTTPException(status_code=401, detail="Invalid authentication scheme")
-
-#     token = creds.credentials
-#     try:
-#         payload = jwt.decode(
-#             token, os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret"), algorithms=["HS256"]
-#         )
-
-#         auth_data = SimpleNamespace(id=UUID(payload["user_id"]))
-#         return auth_data
-
-#     except jwt.ExpiredSignatureError:
-#         raise HTTPException(status_code=401, detail="Token has expired")
-#     except jwt.InvalidTokenError:
-#         raise HTTPException(status_code=401, detail="Invalid token")
```

cognee-mcp/README.md (2)
52-82: Add language specifications to code blocks for better syntax highlighting. Multiple code blocks are missing language specifications, which affects readability and syntax highlighting.
Apply these fixes to improve code block formatting:
````diff
- ```
+ ```bash
 git clone https://github.com/topoteretes/cognee.git
- ```
+ ```
- ```
+ ```bash
 cd cognee/cognee-mcp
- ```
+ ```
- ```
+ ```bash
 pip install uv
- ```
+ ```
- ```
+ ```bash
 uv sync --dev --all-extras --reinstall
- ```
+ ```
- ```
+ ```bash
 source .venv/bin/activate
- ```
+ ```
- ```
+ ```env
 LLM_API_KEY="YOUR_OPENAI_API_KEY"
- ```
+ ```
- ```
+ ```bash
 python src/server.py
- ```
+ ```
- ```
+ ```bash
 python src/server.py --transport sse
- ```
+ ```
````
169-177: Add language specifications to remaining code blocks. The development section also has code blocks without language specifications.
Apply these fixes:
````diff
- ```
+ ```toml
 #"cognee[postgres,codegraph,gemini,huggingface,docs,neo4j] @ file:/Users/<username>/Desktop/cognee"
- ```
+ ```
- ```
+ ```bash
 uv sync --reinstall
- ```
+ ```
````

cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (1)
5-5: Remove unused imports. Static analysis correctly identified unused imports that should be removed for cleaner code.
Apply this fix:
```diff
-from typing import Dict, Any, List, Optional, Tuple
+from typing import List, Optional, Tuple
```

cognee/api/v1/add/routers/get_add_router.py (1)
39-43: Complete the TODO items for dataset integration. The TODO comments indicate incomplete dataset integration in the GitHub clone and URL fetch logic. These code paths don't utilize the provided dataset parameters, which could lead to data being added to the wrong dataset.
Would you like me to help implement the dataset integration for these code paths or create an issue to track this work?
Also applies to: 50-51
cognee-frontend/src/app/(graph)/CrewAITrigger.tsx (2)
13-16: Improve TypeScript type safety. The use of `any` types reduces type safety. Consider defining proper interfaces for the data structures.

```diff
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  onData: (data: any) => void;
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  onActivity: (activities: any) => void;
+  onData: (data: { nodes: NodeData[]; links: LinkData[] } | null) => void;
+  onActivity: (activities: Activity[]) => void;
```

Define the missing interfaces:
```typescript
interface NodeData {
  id: string;
  type: string;
  // other node properties
}

interface LinkData {
  source: string;
  target: string;
  // other link properties
}

interface Activity {
  id: string;
  timestamp: number;
  activity: string;
}
```
91-92: Ensure WebSocket cleanup in error scenarios. The WebSocket is closed in the `finally` block, but there's a potential race condition if the WebSocket `onmessage` handler closes it first. Consider adding a flag to prevent double closure.

```diff
+let websocketClosed = false;
 websocket.onmessage = (event) => {
   // ... existing code ...
   if (data.status === "PipelineRunCompleted") {
+    websocketClosed = true;
     websocket.close();
   }
 };

 // ... in finally block ...
 .finally(() => {
-  websocket.close();
+  if (!websocketClosed) {
+    websocket.close();
+  }
   setIsCrewAIRunning(false);
 });
```

cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2)
21-26: Consider making dataset configuration dynamic. The hardcoded `MAIN_DATASET` limits flexibility for multi-dataset scenarios mentioned in the PR objectives. Consider making this configurable via props or context.

```diff
-const MAIN_DATASET = {
-  id: "",
-  data: [],
-  status: "",
-  name: "main_dataset",
-};
+interface SearchViewProps {
+  dataset?: Dataset;
+}

-export default function SearchView() {
+export default function SearchView({ dataset = DEFAULT_DATASET }: SearchViewProps) {
   // ... use dataset prop instead of MAIN_DATASET
```
130-134: Consider improving accessibility for the select element. The native select element could benefit from better accessibility attributes and consistent styling with other form elements.
```diff
-<Select name="searchType" defaultValue={searchOptions[0].value} className="max-w-2xs">
+<Select
+  name="searchType"
+  defaultValue={searchOptions[0].value}
+  className="max-w-2xs"
+  aria-label="Search type selection"
+>
```

cognee-frontend/src/app/(graph)/GraphControls.tsx (2)
134-135: Potential DOM manipulation issue. Direct DOM manipulation using `getElementById` can be unreliable in React. Consider using a ref or controlled component pattern instead.

```diff
+const graphShapeSelectRef = useRef<HTMLSelectElement>(null);

 // In the timeout callback:
-const graphShapeSelectElement = document.getElementById("graph-shape-select") as HTMLSelectElement;
-graphShapeSelectElement.value = newValue;
+if (graphShapeSelectRef.current) {
+  graphShapeSelectRef.current.value = newValue;
+}

 // In the JSX:
-<Select defaultValue={DEFAULT_GRAPH_SHAPE} onChange={handleGraphShapeControl} id="graph-shape-select" className="flex-2/5">
+<Select ref={graphShapeSelectRef} defaultValue={DEFAULT_GRAPH_SHAPE} onChange={handleGraphShapeControl} className="flex-2/5">
```
138-138: Fix the timeout type casting. The type casting `as unknown as number` indicates a TypeScript issue. In browser environments, `setTimeout` returns a number, but Node.js returns a `NodeJS.Timeout`. Consider using a more robust approach.

```diff
-shapeChangeTimeout.current = setTimeout(() => {
+shapeChangeTimeout.current = window.setTimeout(() => {
   // ... callback code ...
-}, 5000) as unknown as number;
+}, 5000);
```

And update the ref type:
```diff
-const shapeChangeTimeout = useRef<number | null>();
+const shapeChangeTimeout = useRef<number | null>(null);
```

cognee/modules/pipelines/operations/pipeline.py (1)
76-77: Merge `isinstance` calls. The static analysis correctly identifies this simplification opportunity.
```diff
-if isinstance(datasets, str) or isinstance(datasets, UUID):
+if isinstance(datasets, (str, UUID)):
```

cognee/modules/search/methods/search.py (2)
43-57: Improve docstring completeness. The docstring is incomplete with placeholder text. Consider providing comprehensive documentation for this critical function.
```diff
+"""
+Search function with optional permission-aware filtering.
+
+Args:
+    query_text: The search query string
+    query_type: Type of search to perform (SearchType enum)
+    dataset_ids: Optional list of dataset UUIDs to search within
+    user: User performing the search
+    system_prompt_path: Path to system prompt file
+    top_k: Maximum number of results to return
+    node_type: Optional node type filter for graph searches
+    node_name: Optional node name filter for graph searches
+
+Returns:
+    List of search results, format depends on query_type
+
+Notes:
+    When ENABLE_BACKEND_ACCESS_CONTROL=true, searches are filtered by user permissions
+"""
```
59-62: Consider environment variable caching. Repeated environment variable lookups can be optimized by caching the value.
```diff
+# Cache at module level
+_ACCESS_CONTROL_ENABLED = os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true"

-if os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true":
+if _ACCESS_CONTROL_ENABLED:
```

cognee/api/v1/permissions/routers/get_permissions_router.py (1)
25-26: Optimize list comprehension. The static analysis correctly identifies an unnecessary comprehension.
```diff
-[dataset_id for dataset_id in dataset_ids],
+list(dataset_ids),
```

cognee/api/v1/cognify/cognify.py (1)
44-59: Consider removing unnecessary else clause. The `else` clause after `return` is unnecessary, as suggested by Pylint.
```diff
-    if run_in_background:
-        return await run_cognify_as_background_process(
-            tasks=tasks,
-            user=user,
-            datasets=datasets,
-            vector_db_config=vector_db_config,
-            graph_db_config=graph_db_config,
-        )
-    else:
-        return await run_cognify_blocking(
-            tasks=tasks,
-            user=user,
-            datasets=datasets,
-            vector_db_config=vector_db_config,
-            graph_db_config=graph_db_config,
-        )
+    if run_in_background:
+        return await run_cognify_as_background_process(
+            tasks=tasks,
+            user=user,
+            datasets=datasets,
+            vector_db_config=vector_db_config,
+            graph_db_config=graph_db_config,
+        )
+
+    return await run_cognify_blocking(
+        tasks=tasks,
+        user=user,
+        datasets=datasets,
+        vector_db_config=vector_db_config,
+        graph_db_config=graph_db_config,
+    )
```

cognee/api/v1/cognify/routers/get_cognify_router.py (1)
76-77: Consider simplifying nested context managers. As suggested by Ruff, multiple contexts can be combined into a single `with` statement.
```diff
-    async with db_engine.get_async_session() as session:
-        async with get_user_db_context(session) as user_db:
+    async with (
+        db_engine.get_async_session() as session,
+        get_user_db_context(session) as user_db,
+    ):
```

cognee/modules/ontology/rdf_xml/OntologyResolver.py (1)
126-202: Consider refactoring `get_subgraph` method to reduce complexity. The method works correctly but has high cyclomatic complexity. The BFS traversal logic and RDF triple pattern matching are sound, but the method could benefit from being broken down into smaller functions.
Consider extracting helper methods like:
- `_process_individual_relations(current, visited, queue, nodes_set, edges)`
- `_process_subclass_relations(current, visited, queue, nodes_set, edges)`
- `_process_object_properties(current, obj_props, visited, queue, nodes_set, edges, directed)`

This would improve readability and make the method easier to test and maintain.
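As an illustration of this decomposition (the helper shape and data structures below are hypothetical, not the actual `OntologyResolver` API), the BFS loop can stay small while per-node relation handling lives in its own function:

```python
from collections import deque


def _process_relations(current, relations, visited, queue, nodes_set, edges):
    """Hypothetical helper: record edges from `current` and enqueue unvisited neighbors.

    In the real resolver, one such helper would exist per relation family
    (individuals, subclasses, object properties).
    """
    for relation, neighbor in relations.get(current, []):
        edges.add((current, relation, neighbor))
        nodes_set.add(neighbor)
        if neighbor not in visited:
            visited.add(neighbor)
            queue.append(neighbor)


def get_subgraph(start, relations):
    """BFS traversal whose per-node work is delegated to helpers."""
    visited = {start}
    queue = deque([start])
    nodes_set = {start}
    edges = set()
    while queue:
        current = queue.popleft()
        # one call per relation family keeps this loop flat and testable
        _process_relations(current, relations, visited, queue, nodes_set, edges)
    return nodes_set, edges
```

Each helper can then be unit-tested in isolation with a small adjacency mapping, without constructing a full RDF graph.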
cognee/api/v1/datasets/routers/get_datasets_router.py (1)
153-170: Graph endpoint returns structured data correctly. The modified endpoint now returns structured `GraphDTO` data instead of string URLs, which is more useful for frontend consumption. However, consider improving error handling specificity.

Consider catching specific exceptions instead of the generic `Exception` and returning more specific error responses:

```diff
-    except Exception:
+    except (DatasetNotFoundError, DataNotFoundError) as e:
         return JSONResponse(
-            status_code=409,
-            content="Error retrieving dataset graph data.",
+            status_code=404,
+            content={"detail": str(e)},
+        )
+    except Exception as e:
+        logger.error(f"Unexpected error retrieving dataset graph: {str(e)}")
+        return JSONResponse(
+            status_code=500,
+            content={"detail": "Internal server error retrieving dataset graph."},
         )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (27)
- `.dlt/config.toml` is excluded by `!**/*.toml`
- `.github/workflows/backend_docker_build_test.yml` is excluded by `!**/*.yml`
- `.github/workflows/cd.yaml` is excluded by `!**/*.yaml`
- `.github/workflows/cd_prd.yaml` is excluded by `!**/*.yaml`
- `.github/workflows/e2e_tests.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_gemini.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_mcp.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_memgraph.yml` is excluded by `!**/*.yml`
- `.github/workflows/test_suites.yml` is excluded by `!**/*.yml`
- `.github/workflows/vector_db_tests.yml` is excluded by `!**/*.yml`
- `cognee-frontend/package-lock.json` is excluded by `!**/package-lock.json`, `!**/*.json`
- `cognee-frontend/package.json` is excluded by `!**/*.json`
- `cognee-frontend/public/images/cognee-logo-with-text.png` is excluded by `!**/*.png`
- `cognee-frontend/public/images/crewai.png` is excluded by `!**/*.png`
- `cognee-frontend/public/images/deepnote.svg` is excluded by `!**/*.svg`
- `cognee-frontend/public/images/lancedb.svg` is excluded by `!**/*.svg`
- `cognee-frontend/public/images/neo4j.png` is excluded by `!**/*.png`
- `cognee-frontend/src/app/(graph)/example_data.json` is excluded by `!**/*.json`
- `cognee-mcp/pyproject.toml` is excluded by `!**/*.toml`
- `cognee-starter-kit/pyproject.toml` is excluded by `!**/*.toml`
- `cognee-starter-kit/src/data/companies.json` is excluded by `!**/*.json`
- `cognee-starter-kit/src/data/people.json` is excluded by `!**/*.json`
- `evals/comparative_eval/hotpot_50_corpus.json` is excluded by `!**/*.json`
- `evals/comparative_eval/hotpot_50_qa_pairs.json` is excluded by `!**/*.json`
- `poetry.lock` is excluded by `!**/*.lock`
- `pyproject.toml` is excluded by `!**/*.toml`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (207)
- .env.template (2 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (1 hunks)
- alembic/versions/ab7e313804ae_permission_system_rework.py (1 hunks)
- cognee-frontend/.prettierignore (1 hunks)
- cognee-frontend/.prettierrc (1 hunks)
- cognee-frontend/Dockerfile (2 hunks)
- cognee-frontend/eslint.config.mjs (1 hunks)
- cognee-frontend/postcss.config.mjs (1 hunks)
- cognee-frontend/src/app/(graph)/ActivityLog.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/CrewAITrigger.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphControls.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphLegend.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphView.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/GraphVisualization.tsx (1 hunks)
- cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1 hunks)
- cognee-frontend/src/app/auth/AuthForm.tsx (1 hunks)
- cognee-frontend/src/app/auth/AuthPage.module.css (0 hunks)
- cognee-frontend/src/app/auth/AuthPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/layout.tsx (1 hunks)
- cognee-frontend/src/app/auth/login/LoginPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/login/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1 hunks)
- cognee-frontend/src/app/auth/signup/page.tsx (1 hunks)
- cognee-frontend/src/app/auth/token/route.ts (1 hunks)
- cognee-frontend/src/app/globals.css (1 hunks)
- cognee-frontend/src/app/layout.tsx (1 hunks)
- cognee-frontend/src/app/page.tsx (1 hunks)
- cognee-frontend/src/app/wizard/AddStep/AddStep.module.css (0 hunks)
- cognee-frontend/src/app/wizard/AddStep/AddStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/AddStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/CognifyStep/CognifyStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/CognifyStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/ConfigStep/ConfigStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/ConfigStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/ExploreStep/ExploreStep.tsx (0 hunks)
- cognee-frontend/src/app/wizard/ExploreStep/index.ts (0 hunks)
- cognee-frontend/src/app/wizard/WizardPage.module.css (0 hunks)
- cognee-frontend/src/app/wizard/WizardPage.tsx (0 hunks)
- cognee-frontend/src/app/wizard/page.tsx (0 hunks)
- cognee-frontend/src/middleware.ts (1 hunks)
- cognee-frontend/src/modules/auth/auth0.ts (1 hunks)
- cognee-frontend/src/modules/chat/api/getHistory.ts (1 hunks)
- cognee-frontend/src/modules/chat/hooks/useChat.ts (1 hunks)
- cognee-frontend/src/modules/datasets/cognifyDataset.ts (1 hunks)
- cognee-frontend/src/modules/datasets/createDataset.ts (1 hunks)
- cognee-frontend/src/modules/datasets/getDatasetGraph.ts (1 hunks)
- cognee-frontend/src/modules/exploration/getExplorationGraphUrl.ts (1 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DataView/index.ts (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.module.css (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/StatusIcon.tsx (0 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/index.ts (0 hunks)
- cognee-frontend/src/modules/ingestion/addData.ts (1 hunks)
- cognee-frontend/src/modules/ingestion/useDatasets.ts (3 hunks)
- cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1 hunks)
- cognee-frontend/src/ui/Icons/AddIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/CaretIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/DeleteIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/GitHubIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/SearchIcon.tsx (1 hunks)
- cognee-frontend/src/ui/Icons/index.ts (1 hunks)
- cognee-frontend/src/ui/Partials/Explorer/Explorer.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Explorer/Explorer.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/FeedbackForm.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/Footer/Footer.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Footer/Footer.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SearchView/SearchView.module.css (1 hunks)
- cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SettingsModal/Settings.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SettingsModal/SettingsModal.tsx (1 hunks)
- cognee-frontend/src/ui/Partials/SignInForm/SignInForm.tsx (3 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.module.css (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/WizardHeading.tsx (0 hunks)
- cognee-frontend/src/ui/Partials/Wizard/index.ts (0 hunks)
- cognee-frontend/src/ui/Partials/index.ts (1 hunks)
- cognee-frontend/src/ui/elements/CTAButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/GhostButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Input.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Modal.tsx (1 hunks)
- cognee-frontend/src/ui/elements/NeutralButton.tsx (1 hunks)
- cognee-frontend/src/ui/elements/Select.tsx (1 hunks)
- cognee-frontend/src/ui/elements/StatusIndicator.tsx (1 hunks)
- cognee-frontend/src/ui/elements/TextArea.tsx (1 hunks)
- cognee-frontend/src/ui/elements/index.ts (1 hunks)
- cognee-frontend/src/utils/fetch.ts (1 hunks)
- cognee-frontend/src/utils/handleServerErrors.ts (1 hunks)
- cognee-frontend/src/utils/index.ts (1 hunks)
- cognee-frontend/src/utils/useBoolean.ts (1 hunks)
- cognee-frontend/types/d3-force-3d.d.ts (1 hunks)
- cognee-mcp/README.md (2 hunks)
- cognee-mcp/src/server.py (6 hunks)
- cognee-mcp/src/test_client.py (1 hunks)
- cognee-starter-kit/.env.template (1 hunks)
- cognee-starter-kit/.gitignore (1 hunks)
- cognee-starter-kit/README.md (1 hunks)
- cognee-starter-kit/src/pipelines/custom-model.py (1 hunks)
- cognee-starter-kit/src/pipelines/default.py (1 hunks)
- cognee-starter-kit/src/pipelines/low_level.py (1 hunks)
- cognee/__init__.py (1 hunks)
- cognee/api/client.py (3 hunks)
- cognee/api/v1/add/add.py (2 hunks)
- cognee/api/v1/add/routers/get_add_router.py (4 hunks)
- cognee/api/v1/cognify/code_graph_pipeline.py (3 hunks)
- cognee/api/v1/cognify/cognify.py (3 hunks)
- cognee/api/v1/cognify/routers/get_cognify_router.py (1 hunks)
- cognee/api/v1/datasets/routers/get_datasets_router.py (5 hunks)
- cognee/api/v1/delete/exceptions.py (1 hunks)
- cognee/api/v1/delete/routers/get_delete_router.py (1 hunks)
- cognee/api/v1/permissions/routers/get_permissions_router.py (1 hunks)
- cognee/api/v1/search/routers/get_search_router.py (3 hunks)
- cognee/api/v1/search/search.py (1 hunks)
- cognee/api/v1/settings/routers/get_settings_router.py (1 hunks)
- cognee/api/v1/users/routers/get_auth_router.py (1 hunks)
- cognee/api/v1/users/routers/get_visualize_router.py (1 hunks)
- cognee/api/v1/visualize/visualize.py (2 hunks)
- cognee/context_global_variables.py (1 hunks)
- cognee/eval_framework/analysis/metrics_calculator.py (1 hunks)
- cognee/eval_framework/corpus_builder/corpus_builder_executor.py (1 hunks)
- cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py (2 hunks)
- cognee/eval_framework/evaluation/deep_eval_adapter.py (3 hunks)
- cognee/exceptions/exceptions.py (1 hunks)
- cognee/fetch_secret.py (0 hunks)
- cognee/infrastructure/databases/graph/config.py (1 hunks)
- cognee/infrastructure/databases/graph/get_graph_engine.py (3 hunks)
- cognee/infrastructure/databases/graph/kuzu/adapter.py (3 hunks)
- cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (1 hunks)
- cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1 hunks)
- cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (2 hunks)
- cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (20 hunks)
- cognee/infrastructure/databases/graph/neo4j_driver/neo4j_metrics_utils.py (2 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
- cognee/infrastructure/databases/utils/__init__.py (1 hunks)
- cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1 hunks)
- cognee/infrastructure/databases/vector/config.py (1 hunks)
- cognee/infrastructure/databases/vector/create_vector_engine.py (2 hunks)
- cognee/infrastructure/databases/vector/get_vector_engine.py (1 hunks)
- cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py (1 hunks)
- cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1 hunks)
- cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py (4 hunks)
- cognee/modules/data/exceptions/__init__.py (1 hunks)
- cognee/modules/data/exceptions/exceptions.py (1 hunks)
- cognee/modules/data/methods/__init__.py (1 hunks)
- cognee/modules/data/methods/check_dataset_name.py (1 hunks)
- cognee/modules/data/methods/create_dataset.py (0 hunks)
- cognee/modules/data/methods/get_authorized_existing_datasets.py (1 hunks)
- cognee/modules/data/methods/get_dataset_ids.py (1 hunks)
- cognee/modules/data/methods/get_unique_dataset_id.py (1 hunks)
- cognee/modules/data/methods/load_or_create_datasets.py (1 hunks)
- cognee/modules/data/models/Data.py (0 hunks)
- cognee/modules/data/models/Dataset.py (1 hunks)
- cognee/modules/data/processing/document_types/open_data_file.py (1 hunks)
- cognee/modules/graph/methods/__init__.py (1 hunks)
- cognee/modules/graph/methods/get_formatted_graph_data.py (1 hunks)
- cognee/modules/graph/utils/expand_with_nodes_and_edges.py (4 hunks)
- cognee/modules/ingestion/classify.py (1 hunks)
- cognee/modules/ingestion/data_types/__init__.py (0 hunks)
- cognee/modules/metrics/operations/get_pipeline_run_metrics.py (2 hunks)
- cognee/modules/ontology/rdf_xml/OntologyResolver.py (3 hunks)
- cognee/modules/pipelines/methods/__init__.py (1 hunks)
- cognee/modules/pipelines/methods/get_pipeline_run.py (1 hunks)
- cognee/modules/pipelines/models/PipelineRunInfo.py (1 hunks)
- cognee/modules/pipelines/models/__init__.py (1 hunks)
- cognee/modules/pipelines/operations/log_pipeline_run_initiated.py (1 hunks)
- cognee/modules/pipelines/operations/log_pipeline_run_start.py (2 hunks)
- cognee/modules/pipelines/operations/pipeline.py (6 hunks)
- cognee/modules/pipelines/operations/run_tasks.py (2 hunks)
- cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1 hunks)
- cognee/modules/pipelines/utils/__init__.py (1 hunks)
- cognee/modules/pipelines/utils/generate_pipeline_id.py (1 hunks)
- cognee/modules/pipelines/utils/generate_pipeline_run_id.py (1 hunks)
- cognee/modules/retrieval/exceptions/exceptions.py (1 hunks)
- cognee/modules/retrieval/graph_completion_retriever.py (3 hunks)
- cognee/modules/retrieval/utils/brute_force_triplet_search.py (1 hunks)
- cognee/modules/retrieval/utils/description_to_codepart_search.py (2 hunks)
- cognee/modules/search/methods/search.py (4 hunks)
- cognee/modules/search/operations/get_history.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/__init__.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/api_bearer_transport.py (1 hunks)
- cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py (1 hunks)
- cognee/modules/users/authentication/default/__init__.py (1 hunks)
- cognee/modules/users/authentication/default/default_jwt_strategy.py (1 hunks)
- cognee/modules/users/authentication/default/default_transport.py (1 hunks)
- cognee/modules/users/authentication/get_api_auth_backend.py (1 hunks)
- cognee/modules/users/authentication/get_auth_backend.py (0 hunks)
- cognee/modules/users/authentication/get_client_auth_backend.py (1 hunks)
- cognee/modules/users/exceptions/__init__.py (1 hunks)
- cognee/modules/users/exceptions/exceptions.py (1 hunks)
- cognee/modules/users/get_fastapi_users.py (1 hunks)
- cognee/modules/users/get_user_manager.py (2 hunks)
- cognee/modules/users/methods/__init__.py (1 hunks)
- cognee/modules/users/methods/get_authenticated_user.py (1 hunks)
- cognee/modules/users/methods/get_default_user.py (2 hunks)
- cognee/modules/users/methods/get_user.py (2 hunks)
- cognee/modules/users/methods/get_user_by_email.py (1 hunks)
- cognee/modules/users/models/ACL.py (1 hunks)
- cognee/modules/users/models/DatasetDatabase.py (1 hunks)
- cognee/modules/users/models/Tenant.py (1 hunks)
- cognee/modules/users/models/User.py (2 hunks)
- cognee/modules/users/models/__init__.py (1 hunks)
⛔ Files not processed due to max files limit (56)
- cognee/modules/users/permissions/__init__.py
- cognee/modules/users/permissions/methods/__init__.py
- cognee/modules/users/permissions/methods/authorized_give_permission_on_datasets.py
- cognee/modules/users/permissions/methods/check_permission_on_dataset.py
- cognee/modules/users/permissions/methods/get_all_user_permission_datasets.py
- cognee/modules/users/permissions/methods/get_document_ids_for_user.py
- cognee/modules/users/permissions/methods/get_principal.py
- cognee/modules/users/permissions/methods/get_principal_datasets.py
- cognee/modules/users/permissions/methods/get_role.py
- cognee/modules/users/permissions/methods/get_specific_user_permission_datasets.py
- cognee/modules/users/permissions/methods/get_tenant.py
- cognee/modules/users/permissions/methods/give_permission_on_dataset.py
- cognee/modules/users/permissions/methods/give_permission_on_document.py
- cognee/modules/users/permissions/permission_types.py
- cognee/modules/users/roles/methods/add_user_to_role.py
- cognee/modules/users/roles/methods/create_role.py
- cognee/modules/users/tenants/methods/__init__.py
- cognee/modules/users/tenants/methods/add_user_to_tenant.py
- cognee/modules/users/tenants/methods/create_tenant.py
- cognee/shared/logging_utils.py
- cognee/tasks/documents/__init__.py
- cognee/tasks/documents/check_permissions_on_dataset.py
- cognee/tasks/documents/detect_language.py
- cognee/tasks/documents/translate_text.py
- cognee/tasks/ingestion/ingest_data.py
- cognee/tasks/ingestion/resolve_data_directories.py
- cognee/tests/test_cognee_server_start.py
- cognee/tests/test_parallel_databases.py
- cognee/tests/test_pgvector.py
- cognee/tests/test_qdrant.py
- cognee/tests/test_remote_kuzu.py
- cognee/tests/test_remote_kuzu_stress.py
- cognee/tests/test_starter_pipelines.py
- cognee/tests/unit/modules/ontology/test_ontology_adapter.py
- cognee/tests/unit/modules/search/search_methods_test.py
- evals/comparative_eval/README.md
- evals/comparative_eval/helpers/calculate_aggregate_metrics.py
- evals/comparative_eval/helpers/convert_metrics.py
- evals/comparative_eval/helpers/modal_evaluate_answers.py
- evals/comparative_eval/qa_benchmark_base.py
- evals/comparative_eval/qa_benchmark_graphiti.py
- evals/comparative_eval/qa_benchmark_lightrag.py
- evals/comparative_eval/qa_benchmark_mem0.py
- examples/python/agentic_reasoning_procurement_example.py
- examples/python/code_graph_example.py
- examples/python/dynamic_steps_example.py
- examples/python/graphiti_example.py
- examples/python/multimedia_example.py
- examples/python/ontology_demo_example.py
- examples/python/ontology_demo_example_2.py
- examples/python/simple_example.py
- examples/python/simple_node_set_example.py
- examples/relational_db_with_dlt/fix_foreign_keys.sql
- examples/relational_db_with_dlt/relational_db_and_dlt.py
- notebooks/cognee_demo.ipynb
- notebooks/cognee_simple_demo.ipynb
💤 Files with no reviewable changes (34)
- cognee-frontend/src/app/wizard/AddStep/AddStep.module.css
- cognee-frontend/src/app/wizard/ExploreStep/index.ts
- cognee-frontend/src/modules/ingestion/DatasetsView/index.ts
- cognee-frontend/src/app/wizard/AddStep/index.ts
- cognee/modules/data/models/Data.py
- cognee/modules/ingestion/data_types/__init__.py
- cognee/modules/data/methods/create_dataset.py
- cognee-frontend/src/ui/Partials/Wizard/WizardHeading.tsx
- cognee-frontend/src/app/auth/AuthPage.module.css
- cognee-frontend/src/app/wizard/ConfigStep/index.ts
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.module.css
- cognee-frontend/src/ui/Partials/Wizard/index.ts
- cognee-frontend/src/app/wizard/ConfigStep/ConfigStep.tsx
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.module.css
- cognee-frontend/src/ui/Partials/Footer/Footer.module.css
- cognee-frontend/src/modules/ingestion/DatasetsView/StatusIcon.tsx
- cognee-frontend/src/ui/Partials/Wizard/WizardContent/WizardContent.tsx
- cognee-frontend/src/ui/Partials/Explorer/Explorer.tsx
- cognee-frontend/src/app/wizard/page.tsx
- cognee-frontend/src/app/wizard/WizardPage.module.css
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx
- cognee-frontend/src/app/wizard/ExploreStep/ExploreStep.tsx
- cognee-frontend/src/app/wizard/CognifyStep/index.ts
- cognee-frontend/src/ui/Partials/Explorer/Explorer.module.css
- cognee-frontend/src/app/wizard/CognifyStep/CognifyStep.tsx
- cognee-frontend/src/modules/ingestion/DataView/index.ts
- cognee-frontend/src/app/wizard/AddStep/AddStep.tsx
- cognee-frontend/src/app/wizard/WizardPage.tsx
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.module.css
- cognee-frontend/src/modules/ingestion/DataView/RawDataPreview.tsx
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx
- cognee/modules/users/authentication/get_auth_backend.py
- cognee/fetch_secret.py
🧰 Additional context used
🪛 Ruff (0.11.9)
cognee/api/v1/settings/routers/get_settings_router.py
50-50: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/users/exceptions/__init__.py
12-12: .exceptions.PermissionNotFoundError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/authentication/default/__init__.py
1-1: .default_transport.default_transport imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .default_jwt_strategy.DefaultJWTStrategy imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/models/__init__.py
4-4: .DatasetDatabase.DatasetDatabase imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/infrastructure/databases/utils/__init__.py
1-1: .get_or_create_dataset_database.get_or_create_dataset_database imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/data/exceptions/__init__.py
10-10: .exceptions.DatasetNotFoundError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
11-11: .exceptions.DatasetTypeError imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/methods/__init__.py
1-1: .get_pipeline_run.get_pipeline_run imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/graph/methods/__init__.py
1-1: .get_formatted_graph_data.get_formatted_graph_data imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/authentication/api_bearer/__init__.py
1-1: .api_bearer_transport.api_bearer_transport imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .api_jwt_strategy.APIJWTStrategy imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/users/methods/__init__.py
5-5: .get_user_by_email.get_user_by_email imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/utils/__init__.py
1-1: .generate_pipeline_id.generate_pipeline_id imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
2-2: .generate_pipeline_run_id.generate_pipeline_run_id imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/pipelines/models/__init__.py
3-3: .PipelineRunInfo.PipelineRunInfo imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
4-4: .PipelineRunInfo.PipelineRunStarted imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
5-5: .PipelineRunInfo.PipelineRunYield imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
6-6: .PipelineRunInfo.PipelineRunCompleted imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
7-7: .PipelineRunInfo.PipelineRunErrored imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee/modules/data/methods/__init__.py
11-11: .get_authorized_existing_datasets.get_authorized_existing_datasets imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
12-12: .get_dataset_ids.get_dataset_ids imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
15-15: .delete_dataset.delete_dataset imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
16-16: .delete_data.delete_data imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
19-19: .load_or_create_datasets.load_or_create_datasets imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
22-22: .check_dataset_name.check_dataset_name imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
cognee-mcp/src/test_client.py
123-124: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
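A hedged sketch of the single-`with` rewrite SIM117 asks for (a hypothetical file-copy helper, not the flagged test-client code): two nested `with` blocks collapse into one statement with multiple context managers.

```python
import os
import tempfile

def copy_text(src: str, dst: str) -> None:
    # flagged form:
    #     with open(src) as fin:
    #         with open(dst, "w") as fout:
    #             ...
    # preferred form: one statement, both context managers
    with open(src) as fin, open(dst, "w") as fout:
        fout.write(fin.read())
```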
207-207: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
244-244: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
281-281: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
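A minimal sketch of the B904 fix (names are illustrative, not from the flagged files): re-raise with `from err` so the traceback chains the original cause explicitly instead of relying on ambiguous implicit chaining.

```python
def parse_port(value: str) -> int:
    try:
        return int(value)
    except ValueError as err:
        # `from err` sets __cause__, distinguishing this error from an
        # unrelated failure inside the exception handler itself.
        raise RuntimeError(f"invalid port: {value!r}") from err
```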
cognee/modules/users/methods/get_user.py
4-4: sqlalchemy.exc imported but unused
Remove unused import: sqlalchemy.exc
(F401)
cognee-mcp/src/server.py
85-85: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py
3-3: cognee.context_global_variables.vector_db_config imported but unused
Remove unused import: cognee.context_global_variables.vector_db_config
(F401)
cognee/modules/users/authentication/default/default_jwt_strategy.py
1-1: jwt imported but unused
Remove unused import: jwt
(F401)
2-2: uuid.UUID imported but unused
Remove unused import: uuid.UUID
(F401)
3-3: fastapi_users.jwt.generate_jwt imported but unused
Remove unused import: fastapi_users.jwt.generate_jwt
(F401)
6-6: cognee.modules.users.models.User imported but unused
Remove unused import: cognee.modules.users.models.User
(F401)
7-7: cognee.modules.users.get_user_manager.UserManager imported but unused
Remove unused import: cognee.modules.users.get_user_manager.UserManager
(F401)
cognee-starter-kit/src/pipelines/low_level.py
38-38: Use a context manager for opening files
(SIM115)
41-41: Use a context manager for opening files
(SIM115)
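A hedged sketch of the SIM115 fix (hypothetical reader, not the starter-kit pipeline code): a `with` block guarantees the file handle is closed even if an exception interrupts the read.

```python
import tempfile

def read_text(path: str) -> str:
    # flagged form:  return open(path).read()  -- handle closed only by GC
    with open(path) as f:
        return f.read()
```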
cognee/context_global_variables.py
35-35: Use os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true" instead of not os.getenv("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true"
Replace with != operator
(SIM201)
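An illustrative helper (not the actual cognee code) showing the SIM201 rewrite: a negated `==` comparison becomes a direct `!=`.

```python
def access_control_disabled(env: dict) -> bool:
    # flagged form:  not env.get(...).lower() == "true"
    return env.get("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() != "true"
```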
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
5-5: typing.Dict imported but unused
Remove unused import
(F401)
5-5: typing.Any imported but unused
Remove unused import
(F401)
cognee/api/v1/add/routers/get_add_router.py
24-24: Do not perform function call Form in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/ingestion/classify.py
20-20: Multiple isinstance calls for data, merge into a single call
Merge isinstance calls for data
(SIM101)
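A hedged sketch of the SIM101 merge (illustrative predicate, not the classify.py code): a chain of `or`-joined `isinstance` calls becomes one call with a tuple of types.

```python
from io import BufferedReader
from tempfile import SpooledTemporaryFile

def is_file_like(data) -> bool:
    # flagged form:
    #     isinstance(data, BufferedReader) or isinstance(data, SpooledTemporaryFile)
    return isinstance(data, (BufferedReader, SpooledTemporaryFile))
```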
cognee/api/v1/search/routers/get_search_router.py
34-34: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/api/v1/search/search.py
24-24: Multiple isinstance calls for datasets, merge into a single call
Merge isinstance calls for datasets
(SIM101)
cognee/api/v1/permissions/routers/get_permissions_router.py
19-19: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
35-35: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
44-44: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-54: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
63-63: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
cognee/modules/pipelines/operations/pipeline.py
76-76: Multiple isinstance calls for datasets, merge into a single call
Merge isinstance calls for datasets
(SIM101)
cognee/api/v1/cognify/cognify.py
100-100: Undefined name anext. Consider specifying requires-python = ">= 3.10" or tool.ruff.target-version = "py310" in your pyproject.toml file.
(F821)
105-105: Undefined name anext. Consider specifying requires-python = ">= 3.10" or tool.ruff.target-version = "py310" in your pyproject.toml file.
(F821)
cognee/api/v1/cognify/routers/get_cognify_router.py
43-43: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
76-77: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
cognee/api/v1/datasets/routers/get_datasets_router.py
75-75: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
89-89: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
154-154: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🪛 dotenv-linter (3.3.0)
cognee-starter-kit/.env.template
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
[warning] 3-3: [QuoteCharacter] The value has quote characters (', ")
[warning] 4-4: [QuoteCharacter] The value has quote characters (', ")
[warning] 6-6: [QuoteCharacter] The value has quote characters (', ")
[warning] 6-6: [UnorderedKey] The LLM_ENDPOINT key should go before the LLM_MODEL key
[warning] 7-7: [QuoteCharacter] The value has quote characters (', ")
[warning] 7-7: [UnorderedKey] The LLM_API_VERSION key should go before the LLM_ENDPOINT key
[warning] 10-10: [QuoteCharacter] The value has quote characters (', ")
[warning] 11-11: [QuoteCharacter] The value has quote characters (', ")
[warning] 12-12: [QuoteCharacter] The value has quote characters (', ")
[warning] 14-14: [QuoteCharacter] The value has quote characters (', ")
[warning] 14-14: [UnorderedKey] The EMBEDDING_ENDPOINT key should go before the EMBEDDING_MODEL key
[warning] 15-15: [QuoteCharacter] The value has quote characters (', ")
[warning] 15-15: [UnorderedKey] The EMBEDDING_API_VERSION key should go before the EMBEDDING_ENDPOINT key
[warning] 17-17: [ExtraBlankLine] Extra blank line detected
[warning] 18-18: [QuoteCharacter] The value has quote characters (', ")
[warning] 19-19: [EndingBlankLine] No blank line at the end of the file
[warning] 19-19: [QuoteCharacter] The value has quote characters (', ")
[warning] 19-19: [UnorderedKey] The GRAPHISTRY_PASSWORD key should go before the GRAPHISTRY_USERNAME key
🪛 Pylint (3.3.7)
cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py
[refactor] 4-4: Too few public methods (0/2)
(R0903)
cognee-starter-kit/src/pipelines/custom-model.py
[refactor] 33-33: Too few public methods (0/2)
(R0903)
[refactor] 36-36: Too few public methods (0/2)
(R0903)
[refactor] 41-41: Too few public methods (0/2)
(R0903)
[refactor] 44-44: Too few public methods (0/2)
(R0903)
cognee-mcp/src/test_client.py
[refactor] 201-204: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
[refactor] 238-241: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
[refactor] 275-278: Unnecessary "elif" after "break", remove the leading "el" from "elif"
(R1723)
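A hedged sketch of the R1723 cleanup (hypothetical loop, not the test-client code): once a branch ends in `break`, the following `elif` can be a plain `if`, because control never falls through from the `break`.

```python
def first_even(nums):
    result = None
    for n in nums:
        if n % 2 == 0:
            result = n
            break
        if n > 100:  # was `elif` in the flagged form; behavior is identical
            return None
    return result
```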
cognee/modules/data/methods/load_or_create_datasets.py
[refactor] 23-23: Consider merging these comparisons with 'in' by using 'identifier in (ds.name, ds.id)'. Use a set instead if elements are hashable.
(R1714)
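A hedged sketch of the R1714 merge (an illustrative `Dataset`, not the cognee model): two `==` comparisons joined by `or` become one membership test against a tuple, since `str` and `UUID` are both hashable.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4

@dataclass
class Dataset:
    id: UUID
    name: str

def matches(identifier, ds: Dataset) -> bool:
    # flagged form:  identifier == ds.name or identifier == ds.id
    return identifier in (ds.name, ds.id)
```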
cognee/modules/users/models/DatasetDatabase.py
[refactor] 7-7: Too few public methods (0/2)
(R0903)
cognee/modules/users/authentication/default/default_jwt_strategy.py
[refactor] 10-10: Too few public methods (0/2)
(R0903)
cognee-starter-kit/src/pipelines/low_level.py
[refactor] 14-14: Too few public methods (0/2)
(R0903)
[refactor] 19-19: Too few public methods (0/2)
(R0903)
[refactor] 25-25: Too few public methods (0/2)
(R0903)
[refactor] 29-29: Too few public methods (0/2)
(R0903)
[refactor] 38-38: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 41-41: Consider using 'with' for resource-allocating operations
(R1732)
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
[refactor] 84-114: Too many nested blocks (6/5)
(R1702)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py
[refactor] 544-546: Consider using '{"node_id": node_id}' instead of a call to 'dict'.
(R1735)
cognee/modules/ingestion/classify.py
[refactor] 20-20: Consider merging these isinstance calls to isinstance(data, (BufferedReader, SpooledTemporaryFile))
(R1701)
cognee/api/v1/search/routers/get_search_router.py
[refactor] 16-16: Too few public methods (0/2)
(R0903)
cognee/modules/pipelines/models/PipelineRunInfo.py
[refactor] 6-6: Too few public methods (0/2)
(R0903)
[refactor] 16-16: Too few public methods (0/2)
(R0903)
[refactor] 21-21: Too few public methods (0/2)
(R0903)
[refactor] 26-26: Too few public methods (0/2)
(R0903)
[refactor] 31-31: Too few public methods (0/2)
(R0903)
cognee/modules/search/methods/search.py
[refactor] 33-33: Too many arguments (8/5)
(R0913)
[refactor] 33-33: Too many positional arguments (8/5)
(R0917)
[refactor] 146-146: Too many arguments (6/5)
(R0913)
[refactor] 146-146: Too many positional arguments (6/5)
(R0917)
[refactor] 174-174: Too many arguments (6/5)
(R0913)
[refactor] 174-174: Too many positional arguments (6/5)
(R0917)
[refactor] 187-187: Too many arguments (6/5)
(R0913)
[refactor] 187-187: Too many positional arguments (6/5)
(R0917)
cognee/api/v1/search/search.py
[refactor] 12-12: Too many arguments (9/5)
(R0913)
[refactor] 12-12: Too many positional arguments (9/5)
(R0917)
[refactor] 24-24: Consider merging these isinstance calls to isinstance(datasets, (UUID, str))
(R1701)
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py
[refactor] 156-159: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
[error] 413-413: Possibly using variable 'collection_size' before assignment
(E0606)
cognee/api/v1/permissions/routers/get_permissions_router.py
[refactor] 25-25: Unnecessary use of a comprehension, use list(dataset_ids) instead.
(R1721)
cognee/modules/pipelines/operations/pipeline.py
[refactor] 76-76: Consider merging these isinstance calls to isinstance(datasets, (UUID, str))
(R1701)
cognee/api/v1/cognify/cognify.py
[refactor] 44-59: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
cognee/api/v1/cognify/routers/get_cognify_router.py
[refactor] 32-32: Too few public methods (0/2)
(R0903)
cognee/api/v1/datasets/routers/get_datasets_router.py
[refactor] 50-50: Too few public methods (0/2)
(R0903)
[refactor] 56-56: Too few public methods (0/2)
(R0903)
[refactor] 62-62: Too few public methods (0/2)
(R0903)
[refactor] 67-67: Too few public methods (0/2)
(R0903)
[refactor] 71-71: Too many statements (61/50)
(R0915)
cognee/modules/ontology/rdf_xml/OntologyResolver.py
[refactor] 17-17: Too few public methods (1/2)
(R0903)
[refactor] 126-126: Too many local variables (24/15)
(R0914)
[refactor] 126-126: Too many branches (15/12)
(R0912)
[refactor] 126-126: Too many statements (51/50)
(R0915)
🪛 Biome (1.9.4)
cognee-frontend/src/app/globals.css
[error] 30-30: This @import is in the wrong position.
Any @import rules must precede all other valid at-rules and style rules in a stylesheet (ignoring @charset and @layer), or else the @import rule is invalid.
Consider moving import position.
(lint/correctness/noInvalidPositionAtImportRule)
🪛 LanguageTool
cognee-starter-kit/README.md
[uncategorized] ~90-~90: You might be missing the article “an” here.
Context: ...D="" ``` Note: GRAPHISTRY_PASSWORD is API key. # What will you build with cogne...
(AI_EN_LECTOR_MISSING_DETERMINER_AN)
cognee-mcp/README.md
[style] ~165-~165: Consider a more concise word here.
Context: ...cp dev src/server.py` ### Development In order to use local cognee: 1. Uncomment the fol...
(IN_ORDER_TO_PREMIUM)
🪛 markdownlint-cli2 (0.17.2)
cognee-starter-kit/README.md
17-17: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
21-21: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
28-28: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
43-43: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
59-59: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
67-67: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
75-75: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
84-84: Bare URL used
null
(MD034, no-bare-urls)
86-86: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
cognee-mcp/README.md
52-52: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
56-56: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
60-60: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
64-64: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
68-68: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
72-72: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
76-76: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
80-80: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
109-109: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
125-125: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
169-169: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
175-175: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: End-to-End Tests / S3 Bucket Test
- GitHub Check: End-to-End Tests / Server Start Test
- GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
- GitHub Check: Basic Tests / Run Unit Tests
🔇 Additional comments (217)
cognee/modules/data/processing/document_types/open_data_file.py (1)
11-12: LGTM! Good lazy import optimization. Moving the `s3fs` import inside the conditional block is an excellent optimization. This ensures the module is only imported when S3 credentials are available and S3 functionality is actually needed, reducing startup time and avoiding import errors when `s3fs` isn't installed.
cognee-starter-kit/.env.template (1)
1-19: LGTM! Well-structured environment template. The environment template provides clear configuration options for LLM and embedding providers with sensible OpenAI defaults. The grouping of related variables (API_KEY, MODEL, PROVIDER followed by optional ENDPOINT/VERSION) is more intuitive than the alphabetical ordering suggested by the linter.
cognee-starter-kit/.gitignore (1)
1-196: LGTM! Comprehensive and well-organized gitignore. This gitignore file provides excellent coverage for a Python project with potential Node.js components. The thoughtful handling of the logs directory (preserving structure while ignoring contents) and the inclusion of project-specific directories like `.cognee_system/` and `.data_storage/` show good attention to the project's specific needs.
cognee-frontend/.prettierignore (1)
1-3: LGTM! Standard and appropriate Prettier exclusions. The exclusions for `node_modules`, `dist`, and `coverage` are exactly what's needed for a frontend project to avoid formatting generated files and dependencies.
cognee-frontend/src/app/auth/page.tsx (1)
1-1: LGTM! Stylistic improvement for consistency. The change from single to double quotes aligns with code formatting standards and likely supports the new Prettier configuration added to the project.
README.md (1)
42-48: LGTM! Clean formatting improvement. Removing trailing spaces after pipe characters improves the consistency and cleanliness of the markdown formatting in the language links section.
cognee-frontend/src/modules/chat/api/getHistory.ts (2)
1-1: LGTM! Consistent quote style. The change from single to double quotes in the import statement improves code style consistency.
5-5: LGTM! Consistent quote style. The change from single to double quotes for the API endpoint string maintains consistency with the import statement above.
cognee/infrastructure/databases/utils/__init__.py (1)
1-1: LGTM! Proper package-level function exposure. The import correctly exposes `get_or_create_dataset_database` at the package level, enabling easier access from other modules. The static analysis warning about unused import is a false positive since this is a deliberate API exposure pattern.
cognee/modules/users/exceptions/__init__.py (1)
12-12: LGTM! Consistent exception exposure pattern. Adding `PermissionNotFoundError` to the imported exceptions maintains consistency with other user-related exceptions exposed by this package. The static analysis warning is a false positive since this follows the established pattern for exception API exposure.
cognee/modules/graph/methods/__init__.py (1)
1-1: LGTM! Valid re-export in `__init__.py` file. The static analysis warning about unused import is a false positive. This import serves as a re-export in the package's `__init__.py` file, making `get_formatted_graph_data` available at the package level for use in API endpoints (as mentioned in the AI summary).
cognee-frontend/src/utils/index.ts (1)
1-3: LGTM! Good standardization and utility addition. The changes standardize on double quotes and add the new `useBoolean` utility hook to the exports. This aligns with the frontend modernization mentioned in the AI summary.
cognee/api/v1/delete/routers/get_delete_router.py (1)
17-17: LGTM! Consistent route path standardization. Removing the trailing slash from the route decorator improves consistency across API routers, as mentioned in the AI summary. This helps avoid potential routing ambiguities between paths with and without a trailing slash.
cognee/exceptions/exceptions.py (1)
34-36: LGTM! Excellent addition of string representation for exceptions. Adding the `__str__` method to the base exception class follows Python best practices and provides consistent, human-readable error messages. The format matches the logging output in the `__init__` method, ensuring consistency across the codebase.
cognee-frontend/Dockerfile (4)
2-2: LGTM! Good Node.js version upgrade. Upgrading from Node.js 18 to 22 brings performance improvements, security updates, and access to newer JavaScript features. This aligns with the frontend modernization efforts.
12-12: LGTM! Reasonable to disable problematic build step. Commenting out the `npm rebuild lightningcss` command suggests it may have been causing build issues or is no longer necessary with the updated dependencies and Node.js version.
18-18: LGTM! PostCSS configuration for Tailwind CSS support. Adding the PostCSS configuration file copy aligns with the Tailwind CSS integration mentioned in the AI summary, supporting the new styling approach in the frontend.
22-22: LGTM! Improved CMD instruction format. Using the exec form `["npm", "run", "dev"]` instead of shell form is a Docker best practice. It ensures proper signal handling and process management within the container.
cognee-frontend/src/app/auth/signup/page.tsx (1)
1-1: LGTM! Standard Next.js routing pattern. This follows the established Next.js convention for page routing, where `page.tsx` serves as the entry point and delegates to the actual page component.
cognee/api/v1/cognify/code_graph_pipeline.py (3)
5-5: Good addition for consistent logging setup. Adding `setup_logging` to the imports aligns with the logging standardization across the codebase.
100-100: Proper logging setup in main execution block. Good practice to set up logging before running the main async function.
46-46: Verify the significant batch size reduction. The batch size was reduced from 500 to 30 (94% reduction), which is a substantial change that could significantly impact pipeline performance. Please ensure this change is intentional and addresses a specific issue.
```shell
#!/bin/bash
# Description: Check for any comments or documentation about batch size changes
# Expected: Find context about why batch size was reduced

# Search for any TODOs, comments, or documentation about batch size
rg -i "batch.?size|memory|timeout|performance" --type py -A 3 -B 3

# Check if there are any related issues or configurations
rg "add_data_points.*batch" --type py -A 2 -B 2
```
cognee-frontend/.prettierrc (1)
1-7: Well-configured Prettier setup with Tailwind CSS support. The configuration includes sensible defaults and the `prettier-plugin-tailwindcss` plugin for automatic Tailwind class sorting, which will help maintain consistent code formatting across the frontend.
cognee/api/v1/users/routers/get_visualize_router.py (1)
11-11: Route path standardization looks good. Removing the explicit trailing slash from the route decorator aligns with the broader API routing consistency improvements mentioned in the PR.
cognee-frontend/src/app/auth/login/page.tsx (1)
1-1: LGTM! Consistent with Next.js routing conventions. This mirrors the same pattern used in the signup page and follows standard Next.js routing practices.
cognee/modules/search/operations/get_history.py (1)
19-22: LGTM! Good query optimization. The changes improve query construction by:
- Moving `order_by` before `limit` ensures proper ordering of the union results
- Adding the conditional check prevents unnecessary `LIMIT 0` clauses when limit is not positive
- Improves overall query performance and correctness
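The ordering/limit interaction can be sketched with plain SQL (an illustrative in-memory table, not the actual cognee query code): ORDER BY must be applied before LIMIT so the limit keeps the top rows of the ordered result, and a non-positive limit skips the clause entirely instead of emitting `LIMIT 0`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, created_at INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)", [(1, 30), (2, 10), (3, 20)]
)

def latest(limit: int):
    # order first, then (conditionally) limit
    sql = "SELECT id FROM history ORDER BY created_at DESC"
    if limit > 0:
        sql += f" LIMIT {int(limit)}"
    return [row[0] for row in conn.execute(sql)]
```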
cognee/modules/users/authentication/api_bearer/__init__.py (1)
1-2: LGTM! Standard package re-export pattern. This follows the standard Python pattern for making submodule components accessible at the package level. The static analysis warnings about unused imports are false positives - these imports are intentionally re-exported for convenient package-level access.
cognee-frontend/src/modules/ingestion/addData.ts (1)
1-19: LGTM! Good code standardization. The changes standardize string literals to use double quotes, improving code consistency across the frontend codebase. No functional changes were made - the ingestion logic remains correct.
cognee/modules/pipelines/utils/__init__.py (1)
1-2: LGTM! Clean utility re-exports. This follows standard Python package patterns for re-exporting utility functions at the package level. The static analysis warnings about unused imports are false positives - these functions are intended for package-level access and are used elsewhere in the pipeline system.
cognee/api/v1/settings/routers/get_settings_router.py (2)
49-49: LGTM: Route path consistency improvement. Removing trailing slashes from route decorators improves consistency across the API endpoints.
Also applies to: 55-55
50-50: Static analysis false positive - FastAPI dependency injection pattern is correct. The static analysis tool flagged the `Depends(get_authenticated_user)` usage in the function parameter default, but this is the standard and correct FastAPI pattern for dependency injection. The `Depends()` call needs to be in the parameter default to work properly with FastAPI's dependency injection system.
cognee-frontend/src/ui/App/Loading/DefaultLoadingIndicator/LoadingIndicator.module.css (1)
3-4: Good improvement: Using relative units for better responsiveness. Changing from fixed `16px` to `1rem` units makes the loading indicator more responsive and accessible, scaling with user font size preferences.
cognee/modules/users/methods/__init__.py (1)
5-5:

```shell
#!/bin/bash
# Check if __all__ is defined in the re-export __init__.py
rg "__all__" -n cognee/modules/users/methods/__init__.py
```

cognee-frontend/src/ui/Icons/GitHubIcon.tsx (1)
6-6: Excellent improvement: Dynamic color support enhances reusability. Changing from hardcoded `"#fdfdfd"` to the dynamic `{color}` prop makes the GitHubIcon component much more flexible and reusable across different contexts and themes. This aligns with good React component design principles.
cognee-frontend/src/app/layout.tsx (1)
8-8: Good branding update. Updating the title from the default "Create Next App" to "Cognee" properly brands the application.
cognee-frontend/postcss.config.mjs (1)
1-5: Verify the Tailwind CSS PostCSS plugin name. The plugin name `@tailwindcss/postcss` appears unusual. Typically, the Tailwind CSS PostCSS plugin is referenced as `tailwindcss`. Please verify that this plugin name is correct. If it should be the standard Tailwind CSS plugin, apply this diff:

```diff
- "@tailwindcss/postcss": {},
+ "tailwindcss": {},
```

What is the correct PostCSS plugin name for Tailwind CSS in 2024?
cognee/modules/pipelines/operations/log_pipeline_run_start.py (2)
7-7: LGTM! Good addition of the utility import. The import for `generate_pipeline_run_id` aligns with the deterministic ID generation approach mentioned in the PR summary.
18-18: LGTM! Improved pipeline run ID generation. The change from `uuid4()` to `generate_pipeline_run_id(pipeline_id, dataset_id)` provides deterministic and consistent pipeline run identification, which improves traceability and aligns with the broader pipeline refactoring efforts.
cognee/modules/retrieval/utils/brute_force_triplet_search.py (1)
128-129: LGTM! Good documentation improvement. The added parameter descriptions for `node_type` and `node_name` improve the function's documentation clarity and align with the enhanced filtering capabilities mentioned in the PR summary.
cognee/eval_framework/analysis/metrics_calculator.py (1)
41-53: LGTM! Excellent defensive programming improvement. The None check prevents failed evaluations from corrupting the metrics data and details. This enhances the robustness of the evaluation framework and aligns well with the retry logic improvements mentioned in the PR summary.
cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py (1)
234-236: Good defensive programming practice. This safeguard correctly handles the edge case where a collection is empty, preventing LanceDB from failing when `limit` is 0. The early return with an empty list is both efficient and prevents runtime errors.
cognee/api/v1/visualize/visualize.py (2)
5-5: Logging import updated for consistency. Good addition of `setup_logging` to the import statement to support the improved logging configuration.
31-31: Improved logging initialization. The switch from `get_logger(level=ERROR)` to `setup_logging(log_level=ERROR)` aligns with the broader logging standardization effort across the codebase and provides more comprehensive logging configuration.
cognee-frontend/src/app/globals.css (1)
20-20: Good addition for full-height layout. Adding `height: 100%` to the html and body selectors supports full-height layouts, which is useful for modern web applications.
1-8: LGTM! Clean implementation of API bearer transport.The bearer transport configuration follows FastAPI Users patterns correctly. The token URL endpoint and transport naming are appropriate for the API authentication flow.
cognee/infrastructure/databases/vector/get_vector_engine.py (2)
1-1: LGTM! Proper migration to context-aware configuration.The import change to
get_vectordb_context_configaligns with the new context-aware database configuration system for multi-tenant support.
6-7: LGTM! Clean implementation with helpful documentation.The function simplification and explanatory comment clearly communicate the context-based configuration selection. The direct parameter unpacking is efficient and appropriate.
cognee/modules/data/models/Dataset.py (1)
22-23: LGTM! Proper ACL relationship implementation.The bidirectional relationship with cascade delete ensures proper cleanup of ACL records when datasets are removed. This correctly implements the dataset-scoped permission system.
cognee/modules/users/authentication/api_bearer/api_jwt_strategy.py (1)
4-5: Consider future extensibility for API-specific JWT logic.The current implementation is a simple identity subclass of
JWTStrategy. While this serves as a type identifier for the API authentication backend, consider whether API-specific JWT handling (like different token lifetimes, claims, or validation rules) might be needed in the future.Note: The Pylint warning about too few public methods is a false positive in this context, as this class serves an architectural purpose in the authentication system.
cognee/infrastructure/databases/graph/config.py (1)
110-118: LGTM! Well-implemented context-aware configuration. The function correctly implements the pattern for multi-tenant database configurations. The fallback to `get_graph_config().to_hashable_dict()` ensures backward compatibility, and the documentation clearly explains the purpose and benefits.
cognee/modules/pipelines/models/__init__.py (1)
2-8: LGTM! Static analysis warnings are false positives. The imports correctly expose the new pipeline run info classes as part of the module's public API. The static analysis warnings about unused imports are false positives - these classes are intended to be imported by consumers of this module.
cognee/api/v1/users/routers/get_auth_router.py (2)
2-2: LGTM! Authentication backend update aligns with the refactor. The change from `get_auth_backend` to `get_client_auth_backend` is consistent with the broader authentication refactor that introduces separate backends for API and client authentication.
6-6: Function call correctly updated to match the import. The function call is properly updated to use the new client authentication backend.
cognee/modules/users/methods/get_user.py (2)
3-3: Good improvement: selectinload is more efficient for multiple relationships. The change from `joinedload` to `selectinload` is a good optimization when loading multiple relationships (`roles` and `tenant`), as it avoids potential cartesian products.
22-24: Excellent error handling improvement. Adding explicit error handling with `EntityNotFoundError` is much better than returning `None`. This provides clearer error semantics and better debugging information.
cognee-frontend/src/utils/useBoolean.ts (1)
1-14: LGTM! Well-implemented React hook following best practices. The `useBoolean` hook is cleanly implemented with proper use of `useState` and `useCallback`. The memoized setter functions with empty dependency arrays are correctly implemented to prevent unnecessary re-renders.
cognee-frontend/src/ui/Partials/SettingsModal/Settings.tsx (1)
1-201: Clarify if this commenting out is temporary or permanent. The entire Settings component has been commented out, removing all configuration UI functionality. While this aligns with the backend changes introducing `ENABLE_BACKEND_ACCESS_CONTROL`, consider:
- If this is a permanent removal, delete the commented code to improve maintainability
- If this is temporary, add a comment explaining the reason and timeline for restoration
Is this commenting out temporary or should the code be completely removed?
cognee-frontend/eslint.config.mjs (1)
1-16: Well-structured ESLint configuration following modern best practices. The configuration properly handles ES modules, integrates with Next.js and TypeScript, and includes Prettier for consistent formatting. The setup is clean and follows current ESLint standards.
cognee/modules/retrieval/exceptions/exceptions.py (1)
29-36: Well-implemented exception class following consistent patterns. The new `CollectionDistancesNotFoundError` exception properly inherits from `CogneeApiError`, uses an appropriate 404 status code, and provides a clear error message. The implementation is consistent with other exception classes in the module.
cognee/modules/users/exceptions/exceptions.py (1)
51-58: Exception implementation follows consistent patterns. The `PermissionNotFoundError` exception is well-implemented and consistent with other exception classes. The 403 status code is appropriate for permission-related errors, though you might consider whether 404 would be more semantically correct for "not found" scenarios versus "access denied" scenarios.
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1)
456-457: Good defensive programming practice. Ensuring the directory exists before truncating the SQLite database file prevents potential file operation failures. This change improves the robustness of the database deletion process.
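The defensive pattern described here is the standard `os.makedirs(..., exist_ok=True)` guard before touching a file. A minimal sketch (the path layout is illustrative, not Cognee's actual directory structure):

```python
import os
import tempfile

# Illustrative path; the parent "databases" directory may not exist yet.
db_path = os.path.join(tempfile.mkdtemp(), "databases", "cognee.sqlite")

# Without this guard, opening db_path would raise FileNotFoundError
# whenever the parent directory was deleted or never created.
os.makedirs(os.path.dirname(db_path), exist_ok=True)

# Truncate (or create) the database file safely.
with open(db_path, "w"):
    pass

exists = os.path.exists(db_path)
```

`exist_ok=True` makes the call idempotent, so it is safe to run on every deletion/truncation path.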
cognee/modules/pipelines/utils/generate_pipeline_id.py (1)
4-5: LGTM! Clean deterministic UUID generation. The function correctly generates a deterministic UUID5 based on user ID and pipeline name, which enables consistent pipeline identification across the system.
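Deterministic UUID5 generation of this kind can be sketched as follows. The namespace and string format are assumptions for illustration; Cognee's actual `generate_pipeline_id` may combine its inputs differently.

```python
from uuid import NAMESPACE_OID, UUID, uuid5


def generate_pipeline_id(user_id: UUID, pipeline_name: str) -> UUID:
    # uuid5 is a pure function of (namespace, name), so the same inputs
    # always yield the same pipeline ID across processes and restarts.
    return uuid5(NAMESPACE_OID, f"{user_id}{pipeline_name}")


user = UUID("12345678-1234-5678-1234-567812345678")
a = generate_pipeline_id(user, "cognify_pipeline")
b = generate_pipeline_id(user, "cognify_pipeline")
c = generate_pipeline_id(user, "other_pipeline")
```

Unlike `uuid4()`, repeated calls with the same user and pipeline name produce identical IDs, which is what makes cross-run pipeline identification possible.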
cognee/modules/users/get_fastapi_users.py (1)
14-19: Good enhancement for flexible authentication. The dual-backend authentication setup enables more flexible authentication strategies. The implementation correctly instantiates both backends and passes them as a list to FastAPIUsers.
cognee-frontend/src/modules/auth/auth0.ts (1)
1-8: LGTM! Clean Auth0 configuration. The `Auth0Client` setup follows standard practices with appropriate OAuth scopes and API audience configuration.
cognee/modules/retrieval/graph_completion_retriever.py (3)
11-13: Good addition of logging capabilities. Adding structured logging will help with debugging and monitoring retrieval operations.
56-56: Improved node content fallback logic. The change to use "description" as the primary fallback before "name" provides more meaningful content when "text" is unavailable.
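The fallback order described here ("text", then "description", then "name") can be expressed as a simple chained lookup. The field names match the review's description, but the helper itself is a hypothetical sketch, not the retriever's actual code:

```python
def get_node_content(node: dict) -> str:
    # Prefer full text, then a description, then the bare name; empty string last.
    return node.get("text") or node.get("description") or node.get("name") or ""


full = get_node_content({"text": "chunk body", "name": "Node A"})
desc = get_node_content({"description": "a person entity", "name": "Node B"})
name_only = get_node_content({"name": "Node C"})
```

The `or` chain treats missing keys and empty strings the same way, which is usually the desired behavior for display fallbacks.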
139-139: Helpful warning log for empty results. This warning will aid in debugging scenarios where queries return no relevant triplets.
cognee-frontend/src/ui/Icons/index.ts (1)
1-4: Clean addition of new icon exports. The new icon exports follow the established pattern and maintain consistency with existing exports.
cognee-frontend/src/ui/elements/Input.tsx (1)
1-8: Well-implemented reusable Input component. The component properly uses TypeScript typing, classNames for CSS merging, and follows React best practices with prop spreading. The Tailwind styling provides consistent appearance.
cognee/infrastructure/databases/vector/config.py (1)
67-73: Good implementation of context-aware configuration. The function properly supports context-specific vector database configurations with appropriate fallback to default settings. The design enables multi-tenant or dataset-scoped database access.
Verify that the context variable pattern is used consistently across the codebase:
```shell
#!/bin/bash
# Description: Check usage of vector_db_config context variable across the codebase
# Expected: Consistent usage pattern and proper context management
echo "=== Searching for vector_db_config usage ==="
rg -A 3 -B 3 "vector_db_config"
echo -e "\n=== Searching for get_vectordb_context_config usage ==="
rg -A 3 -B 3 "get_vectordb_context_config"
```

cognee-frontend/src/app/auth/signup/SignUpPage.tsx (1)
1-31: Good implementation with room for minor improvements. The component follows React and Next.js conventions well. The structure is clean, and the use of the shared `AuthForm` component promotes reusability.
41-41: Verify the polling interval increase is intentional. The polling interval was increased from 5 seconds to 50 seconds (a 10x increase). This significant change could impact user experience by making status updates appear slower.
Please confirm this 10x increase in polling interval is intentional and aligns with backend performance requirements or rate limiting constraints.
18-30: Good error handling and async patterns. The `fetchDatasetStatuses` function properly handles async operations and maintains clean state management.
72-82:

```shell
#!/bin/bash
# Inspect how FastAPI Users is configured to determine the auth backend and credential method
sed -n '1,200p' cognee/modules/users/methods/get_fastapi_users.py
```

cognee/infrastructure/databases/graph/kuzu/adapter.py (4)
49-54: Improved robustness for database initialization. The enhanced error handling for `FileExistsError` during database directory creation is a good improvement that handles edge cases where a file exists at the database path.
1052-1052: Good generalization of edge query. Changing from `MATCH (n:Node)-[r:EDGE]->(m:Node)` to `MATCH (n:Node)-[r]->(m:Node)` makes the query more flexible by matching any relationship type, not just those labeled `EDGE`.
46-83: Robust database initialization with proper error handling. The initialization method properly handles connection setup and schema creation with good error handling and logging.
1164-1167: Fix logical issue with edge appending. The edge appending logic has an indentation issue that could cause edges to not be added when JSON parsing fails. The current code only appends edges when JSON parsing succeeds, but edges should be appended regardless:

```diff
 if props:
     try:
         data = json.loads(props)
     except json.JSONDecodeError:
         logger.warning(f"Failed to parse JSON props for edge {from_id}->{to_id}")
-        edges.append((from_id, to_id, rel_type, data))
+edges.append((from_id, to_id, rel_type, data))
```

The edge should always be appended, whether JSON parsing succeeds or fails.
Likely an incorrect or invalid review comment.
cognee/modules/users/methods/get_user_by_email.py (1)
9-21: Well-structured async database query function. The function properly uses async patterns with SQLAlchemy, includes appropriate eager loading for related entities, and follows good practices for database access.
cognee-frontend/src/ui/elements/index.ts (1)
1-8: Well-organized barrel export file. The index file follows good practices for organizing UI component exports with consistent naming and structure.
cognee-frontend/src/ui/Partials/index.ts (2)
2-2: Confirm the deprecation strategy for commented exports. The exports for the `SettingsModal` and `Explorer` components are commented out rather than completely removed. This suggests they may be temporarily disabled or deprecated. Please confirm whether these components should be permanently removed or if they're temporarily disabled for future re-enablement.
Also applies to: 5-5
6-6: LGTM: New FeedbackForm export added. The addition of the FeedbackForm export enhances the UI component library with user feedback functionality.
cognee-frontend/src/app/(graph)/getColorForNodeType.ts (1)
4-18: LGTM: Well-structured color mapping system. The node color mapping is well-organized and uses consistent Tailwind CSS colors. The variety of colors provides good visual distinction for different node types.
cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py (1)
11-11: LGTM: Clean refactor to dataset-level permissions. The changes correctly update the permission checking from document-level to dataset-level, which aligns with the broader permission system refactor mentioned in the PR summary.
Also applies to: 34-34
cognee/modules/graph/utils/expand_with_nodes_and_edges.py (1)
78-78: Verify the category mapping logic. The replacement of `isinstance` checks with string comparisons may have incorrect mappings. Based on typical ontology patterns, `Thing` (individuals/instances) should map to `"individuals"` and `ThingClass` (classes/types) should map to `"classes"`.
Please verify that the category mapping is correct by checking the `AttachedOntologyNode` implementation:

```shell
#!/bin/bash
# Description: Verify the category mapping in AttachedOntologyNode
# Expected: Find how Thing and ThingClass map to category values
ast-grep --pattern 'class AttachedOntologyNode { $$$ }'
rg -A 10 -B 5 "category.*classes|category.*individuals"
```

Also applies to: 90-90, 159-159, 171-171
cognee/modules/users/models/User.py (2)
13-13: LGTM: Clean integration with FastAPI Users. The multiple inheritance from `SQLAlchemyBaseUserTableUUID` and `Principal` properly integrates the User model with FastAPI Users while maintaining existing functionality.
50-50: Verify the default verification status. Setting `is_verified: bool = True` by default means new users are automatically verified. Confirm this aligns with your authentication requirements. Consider whether auto-verification is the intended behavior for your use case, or if it should default to `False`, requiring explicit verification.
cognee/api/client.py (2)
71-71: Excellent security improvement! Restricting CORS origins from the wildcard `"*"` to specific localhost URLs significantly improves security by preventing unauthorized cross-origin requests from malicious websites.
199-199: Good logging setup enhancement. Adding a `setup_logging()` call in the main entry point ensures consistent logging configuration across the application, which aligns with the broader logging improvements mentioned in the AI summary.
cognee/modules/users/methods/get_default_user.py (1)
37-39: Excellent exception handling refinement! The specific handling of `NoResultFound` exceptions with detailed error messages is a significant improvement over generic exception catching. This provides better error context and maintains proper exception chaining.
cognee-starter-kit/src/pipelines/default.py (1)
8-72: Excellent pipeline demonstration script! The code is well-structured with proper async patterns, clear directory configuration, and comprehensive demonstration of Cognee features. The use of pathlib for cross-platform path handling and the logical flow from data ingestion to various search types makes this an excellent example for users.
cognee/modules/pipelines/operations/log_pipeline_run_initiated.py (2)
4-4: LGTM: Deterministic pipeline run ID generation. The change from `uuid4()` to `generate_pipeline_run_id(pipeline_id, dataset_id)` provides consistent, deterministic pipeline run identification, which aligns with the broader system improvements for dataset-scoped operations.
9-9: LGTM: Consistent ID generation usage. The implementation correctly uses the new deterministic ID generation utility function.
cognee-frontend/src/ui/Partials/SearchView/SearchView.module.css (1)
1-7: LGTM: Clean message spacing implementation. The sibling combinator CSS rules provide appropriate vertical spacing between adjacent user and system messages. The implementation is clean and minimal.
cognee/modules/data/methods/get_unique_dataset_id.py (1)
6-9: LGTM: Flexible dataset ID handling with proper type safety. The function enhancement to accept both string and UUID inputs is well-implemented:
- Proper type annotation with `Union[str, UUID]`
- Early return optimization for UUID inputs
- Maintains backward compatibility for string dataset names
- Uses `isinstance()` for reliable type checking

This improvement aligns with the system-wide dataset UUID handling enhancements.
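The str/UUID handling described above can be sketched as follows. Note this is a simplified illustration: the real `get_unique_dataset_id` in Cognee also takes the owning user into account, which is omitted here.

```python
from typing import Union
from uuid import NAMESPACE_OID, UUID, uuid5


def get_unique_dataset_id(dataset: Union[str, UUID]) -> UUID:
    # Early return: an explicit UUID is already a stable identifier.
    if isinstance(dataset, UUID):
        return dataset
    # String names are hashed deterministically, preserving the old behavior
    # where the same dataset name always resolves to the same ID.
    return uuid5(NAMESPACE_OID, dataset)


existing = UUID("12345678-1234-5678-1234-567812345678")
from_uuid = get_unique_dataset_id(existing)
from_name = get_unique_dataset_id("my_dataset")
```

The `isinstance` check keeps the two input kinds on separate, predictable paths without any string parsing.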
cognee/eval_framework/corpus_builder/corpus_builder_executor.py (1)
64-67: LGTM: Proper implementation of streaming pipeline execution. The change from a direct `await cognee_pipeline(tasks=tasks)` to async iteration correctly implements the new streaming pipeline run model:
- Assigns the pipeline to a variable without immediately awaiting it
- Uses `async for` to iterate over yielded `run_info` events
- Enables processing of intermediate pipeline results

This aligns with the broader architectural changes for streaming pipeline run results and background processing.
cognee/infrastructure/databases/graph/kuzu/show_remote_kuzu_stats.py (1)
28-29: Good use of proper resource cleanup. The `finally` block ensures the adapter connection is properly closed even if queries fail.
cognee/infrastructure/databases/vector/create_vector_engine.py (3)
46-46: Good fix for critical typo. The correction from `utl=vector_db_url` to `url=vector_db_url` fixes a runtime error that would occur when using supported databases.
11-12: Appropriate default values for optional parameters. Using empty strings as defaults for `vector_db_port` and `vector_db_key` makes sense since the function handles missing credentials with proper error messages later.
10-12: Verify impact of parameter reordering. The parameter order change (moving `vector_db_url` to second position and adding defaults) is a breaking change for existing callers using positional arguments.
Run this script to check for existing usages that might be affected:

```shell
#!/bin/bash
# Search for calls to create_vector_engine to assess breaking change impact
rg -A 3 "create_vector_engine\(" --type py
```

cognee-frontend/src/middleware.ts (2)
5-17: Middleware is currently a pass-through placeholder. The middleware allows all requests without any processing. The commented Auth0 code suggests this is intentional preparation for future authentication.
Consider adding a TODO comment to clarify the intended timeline for enabling the Auth0 integration.
19-29: Well-configured matcher for Next.js static files. The matcher properly excludes Next.js static assets and common metadata files while applying to all other routes.
cognee-mcp/src/server.py (4)
7-7: LGTM: Improved logging import structure. The addition of `setup_logging` to the imports enhances the logging configuration capabilities.
77-77: LGTM: Corrected parameter name. The change from `nodeset="developer_rules"` to `node_set=["developer_rules"]` appears to correct both the parameter name and its structure.
94-94: LGTM: Improved encapsulation by removing global variable. Moving `log_file` retrieval from global scope to local function scope improves encapsulation and thread safety.
Also applies to: 176-176, 238-238
459-459: LGTM: Consistent logging initialization. The change to use `setup_logging()` provides consistent logging configuration across the application.
cognee/__init__.py (1)
9-16: LGTM: Proper initialization sequence for environment and logging. The early loading of environment variables ensures that `LOG_LEVEL` and other configuration values are available when logging is initialized. The `override=True` parameter ensures environment variables take precedence over any defaults.
4-6: LGTM: Proper TypeScript interface definition. The `FooterProps` interface provides good type safety for the optional children prop.
10-23: LGTM: Clean Tailwind CSS implementation. The refactor to use Tailwind utility classes provides a cleaner, more maintainable styling approach compared to CSS modules.
17-21: Verify icon color change aligns with design system. The icon colors were changed from white to black. Please ensure this change aligns with the overall design system and provides sufficient contrast in all usage contexts.
cognee-frontend/src/app/auth/layout.tsx (2)
6-9: LGTM: Proper Next.js metadata configuration. The metadata export follows Next.js 13+ app directory conventions correctly.
11-31: LGTM: Clean and well-structured layout component. The component provides good visual hierarchy with proper spacing and responsive layout using Tailwind classes, and follows React best practices for layout components.
cognee/modules/data/exceptions/exceptions.py (1)
25-42: LGTM! Well-structured exception classes. The new exception classes follow the established pattern and use appropriate HTTP status codes. The implementation is consistent with existing exceptions in the codebase.
Consider making the error messages more specific by allowing parameterization:
```diff
 class DatasetNotFoundError(CogneeApiError):
     def __init__(
         self,
-        message: str = "Dataset not found.",
+        message: str = None,
+        dataset_id: str = None,
         name: str = "DatasetNotFoundError",
         status_code=status.HTTP_404_NOT_FOUND,
     ):
+        if message is None:
+            message = f"Dataset {dataset_id} not found." if dataset_id else "Dataset not found."
         super().__init__(message, name, status_code)
```

cognee/modules/data/methods/__init__.py (1)
11-22: LGTM! New dataset method imports are properly organized. The new imports follow the established pattern and are well-categorized. The static analysis warnings about unused imports are false positives; these imports are meant to be re-exported for use by other modules.
The categorization (Get, Create, Check) makes the code more maintainable and follows the existing structure.
cognee-frontend/src/app/auth/login/LoginPage.tsx (2)
8-31: LGTM! Well-structured login page component. The component follows Next.js best practices with proper use of client-side rendering, Image optimization, and Link navigation. The integration with the AuthForm component provides good separation of concerns.
33-40: LGTM! Proper payload formatting for form-encoded API. The formatPayload function correctly converts the email/password data to URLSearchParams format, which aligns with typical form-encoded authentication endpoints that expect "username" and "password" fields.
cognee-starter-kit/src/pipelines/custom-model.py (2)
33-48: LGTM! Well-defined data models for graph representation. The DataPoint classes are properly structured for representing programming languages and their fields. The metadata configuration for index fields is appropriate for search optimization.
Note: The static analysis warnings about "too few public methods" are false positives - these are data classes/models, not traditional classes that need multiple methods.
65-89: LGTM! Comprehensive demonstration of search capabilities. The code effectively demonstrates multiple search types (graph completion, RAG completion, summaries, and chunks) with proper async/await patterns and clear output formatting.
cognee/modules/users/models/ACL.py (1)
18-22: LGTM! Improved permission architecture with dataset-scoped ACLs. The shift from data-level to dataset-level permissions is a good architectural decision that simplifies permission management. The CASCADE delete ensures proper cleanup when datasets are removed, and the relationship changes are consistent with the foreign key updates.
cognee-frontend/src/app/page.tsx (1)
1-4: Clean architectural refactor with appropriate dynamic rendering. The complete refactor from a complex dataset management UI to a simple re-export pattern is well-executed. The `dynamic = "force-dynamic"` export is appropriate for a graph visualization that likely requires real-time data updates.
10-13: Good addition of required imports for retry functionality. The time and logging imports are appropriately added to support the new retry mechanism.
75-75: Clean integration of retry mechanism. The modification to use the new `_calculate_metric` method properly centralizes the retry logic and error handling.
cognee/modules/users/models/DatasetDatabase.py (1)
7-19: Well-designed SQLAlchemy model with proper relationships and constraints. The model design follows best practices:
- Appropriate use of UUID types for security and uniqueness
- Proper foreign key relationships with CASCADE delete behavior
- Unique constraints on database names prevent conflicts
- Timezone-aware timestamps with proper defaults
The static analysis hint about "too few public methods" is a false positive for SQLAlchemy ORM models, which inherit necessary methods from the Base class.
Verify that the foreign key references exist in the database schema:
```shell
#!/bin/bash
# Description: Verify that the referenced tables and columns exist
# Expected: Find principals.id and datasets.id column definitions
echo "Checking for principals table with id column:"
rg -A 10 -B 5 "class.*Principal" --type py
echo -e "\nChecking for datasets table with id column:"
rg -A 10 -B 5 "class.*Dataset" --type py
```

cognee-frontend/src/app/auth/AuthPage.tsx (1)
12-40: Well-structured conditional rendering with clean UI. The conditional rendering logic is clean, and the UI structure using Tailwind CSS is appropriate. The session-based flow clearly handles both authenticated and unauthenticated states.
Verify the session object structure matches the expected format:
```shell
#!/bin/bash
# Description: Check the Auth0 session structure and user properties
# Expected: Find type definitions or documentation for session.user.name
echo "Checking Auth0 session type definitions:"
rg -A 15 -B 5 "interface.*Session|type.*Session" --type ts --type tsx
echo -e "\nChecking auth0 module exports:"
fd "auth0" --type f --exec cat {}
```

cognee/infrastructure/databases/graph/neo4j_driver/neo4j_metrics_utils.py (2)
59-63: Excellent optimization using GDS stats procedure. The change from streaming all components to using `gds.wcc.stats` is a significant performance improvement. The stats procedure directly returns the component count without requiring data transfer and manual aggregation.
183-187: Clean optimization for clustering coefficient calculation. Similar to the connected components optimization, using `gds.localClusteringCoefficient.stats` is more efficient than streaming individual coefficients and computing the average manually. The direct stats approach reduces both network overhead and computation time.
cognee/infrastructure/databases/utils/get_or_create_dataset_database.py (1)
14-30: Function signature and documentation look good. Clear parameter typing and a comprehensive docstring explaining the atomic create-or-fetch behavior.
cognee/modules/users/authentication/default/default_jwt_strategy.py (1)
10-23: Clarify the implementation status of this class. This class appears to be a placeholder with all functionality commented out. Please clarify whether:
- This is work-in-progress and should be completed
- This is intentionally stubbed for future implementation
- The commented code should be removed entirely
If this is production code, consider adding a docstring or TODO comment to indicate the intended status.
cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py (1)
8-11:

```shell
#!/bin/bash
# Search for where the context var is set to confirm the return type of get_vectordb_context_config
rg -C3 "vector_db_config.set" .
```

cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (2)
617-617: Good standardization of method naming. Renaming `get_neighbours` to `get_neighbors` improves consistency with the American English spelling conventions used elsewhere in the codebase.
637-654: Well-implemented node retrieval methods. The new `get_node` and `get_nodes` methods follow established patterns in the class and provide useful functionality for direct node access by ID(s). The implementations are clean and consistent:
- Proper type hints with `Optional[Dict[str, Any]]` and `List[Dict[str, Any]]`
- Consistent Cypher query patterns
- Appropriate return value handling (None for a missing single node, an empty list for no matches)

cognee/modules/ingestion/classify.py (2)
2-2: Good practice: Proper type annotation for optional dependency. The change from `S3FileSystem` to `Any` for the `s3fs` parameter is appropriate when dealing with optional dependencies.
8-14: Approve the conditional import pattern. The try/except import block at module level is a good practice for handling optional dependencies gracefully.

cognee/modules/metrics/operations/get_pipeline_run_metrics.py (2)
6-6: Good refactoring: Updated import aligns with new architecture. The import of `PipelineRunInfo` correctly reflects the new pipeline run streaming architecture.
33-40: Good implementation: Proper database query and caching logic. The logic correctly checks for existing metrics before creating new ones, which provides good caching behavior and avoids unnecessary computation.

cognee-frontend/src/ui/elements/Select.tsx (3)
1-4: Good imports: Proper TypeScript and utility imports. The imports are well-structured, using appropriate TypeScript types and utility functions.
5-5: Excellent TypeScript usage: Proper interface extension. The component correctly extends `SelectHTMLAttributes<HTMLSelectElement>` and uses proper destructuring with rest props.
7-18: Good implementation: Proper select element with styling. The select element implementation is solid with proper class merging and prop spreading.

cognee/modules/graph/methods/get_formatted_graph_data.py (1)
6-7: Good implementation: Proper context setup for database access control. The function correctly sets up database context variables for dataset and user scoping.

cognee/modules/data/methods/load_or_create_datasets.py (3)
9-16: Good function design: Clear documentation and type hints. The function signature and docstring clearly explain the behavior and expectations.
30-32: Good error handling: Appropriate exception for missing UUIDs. The logic correctly raises `DatasetNotFoundError` when a UUID identifier doesn't match any existing dataset, which is the expected behavior.
35-40: Verify unique ID generation doesn't cause race conditions. Ensure that `get_unique_dataset_id` handles concurrent calls properly to avoid potential race conditions when multiple users create datasets with the same name simultaneously.

```shell
#!/bin/bash
# Search for the implementation of get_unique_dataset_id to verify its concurrency handling
ast-grep --pattern 'async def get_unique_dataset_id($$$)'
```
26-26:

```shell
#!/bin/bash
# Locate the useChat hook file
echo "File path:"
fd --max-depth 4 useChat.ts

# Display the hook signature and dataset parameter usage
echo -e "\nuseChat signature and parameters:"
rg --color=never -n "function useChat" --context 3 cognee-frontend/src/modules/chat/hooks/useChat.ts

echo -e "\nOccurrences of 'dataset' in hook implementation:"
rg --color=never -n "\bdataset\b" --context 2 cognee-frontend/src/modules/chat/hooks/useChat.ts

echo -e "\nOccurrences of 'datasets' in request payload:"
rg --color=never -n "datasets" --context 2 cognee-frontend/src/modules/chat/hooks/useChat.ts
```

.env.template (2)
52-55: Excellent documentation for Docker Compose setup. The addition of clear guidance for using Postgres with Docker Compose is very helpful and addresses a common deployment scenario.
74-81: Well-documented access control feature with appropriate defaults. The comprehensive documentation of the `ENABLE_BACKEND_ACCESS_CONTROL` feature, including supported databases and implications, is excellent. The default `False` value is appropriately conservative for a security feature.

cognee/modules/pipelines/queues/pipeline_run_info_queues.py (1)
33-37: Handle queue exceptions gracefully. The `get_from_queue` function could raise an `asyncio.QueueEmpty` exception when calling `get_nowait()` on an empty queue, though it's currently handled by the conditional check. The implementation correctly handles the empty queue case with the conditional check before calling `get_nowait()`.

cognee-frontend/src/ui/elements/TextArea.tsx (1)
72-78: Potential infinite loop in useEffect. The `useEffect` dependency on `value` combined with checking `textAreaText !== value` could create update cycles if the comparison fails due to whitespace or formatting differences. Add a ref to track the last programmatically set value:

```diff
+const lastSetValueRef = useRef<string>('');
+
 useEffect(() => {
   const fakeTextAreaElement = fakeTextAreaRef.current;
   const textAreaText = fakeTextAreaElement?.innerText;
-  if (fakeTextAreaElement && textAreaText !== value && textAreaText !== placeholder) {
+  if (fakeTextAreaElement && lastSetValueRef.current !== value && textAreaText !== placeholder) {
     fakeTextAreaElement.innerText = value;
+    lastSetValueRef.current = value;
   }
 }, [value]);
```

Likely an incorrect or invalid review comment.
cognee/modules/pipelines/models/PipelineRunInfo.py (1)
6-13: Well-designed base model with appropriate configuration. The `PipelineRunInfo` base class effectively defines the core structure for pipeline run information. The `arbitrary_types_allowed` configuration is appropriate for the flexible `payload` field.
alembic/versions/ab7e313804ae_permission_system_rework.py (4)
27-48: Well-structured table definition helper function. The `_define_dataset_table` function correctly defines the dataset table schema without depending on application models. This is the recommended approach for Alembic migrations.
82-119: Robust permission management with proper error handling. The `_ensure_permission` function handles permission creation efficiently and includes proper fallback logic. The use of `bulk_insert` is appropriate for single-row inserts in this context.
147-184: Solid migration logic with proper data handling. The upgrade function follows good practices:
- Drops and recreates the table cleanly
- Uses helper functions to avoid model dependencies
- Handles empty dataset case gracefully
- Uses bulk operations for performance
- Creates comprehensive permission sets (read, write, share, delete)
187-222: Complete and correct downgrade implementation. The downgrade function properly reverses the migration by recreating the data-based ACLs table and migrating permissions back. The logic mirrors the upgrade process appropriately.
cognee-frontend/src/app/(graph)/GraphView.tsx (4)
18-27: Well-defined TypeScript interfaces. The `GraphNode` and `GraphData` interfaces provide clear type definitions for the component's data structures.
36-48: Robust data change handler with proper validation. The `onDataChange` callback correctly handles both reset (null) and update scenarios, with appropriate validation for empty data.
75-80: Smart re-render optimization using key prop.Using
data?.nodes.lengthas the key forGraphVisualizationensures the component re-renders when the data structure changes significantly. This is a good optimization technique.
101-102: Clean imperative API usage with proper null checks.The callback functions correctly use the non-null assertion operator (
!) which is safe here since the callbacks are only called from child components when the refs are guaranteed to be initialized.cognee/modules/pipelines/operations/run_tasks.py (5)
2-13: LGTM: Clean import organization for refactored functionality.
The new imports are well-organized and support the transition to structured pipeline run events. The addition of user methods, pipeline utilities, and structured run info models aligns well with the refactoring objectives.

86-89: Excellent defensive programming and pipeline ID generation improvement.
The addition of default user fetching provides good fallback behavior, and the switch from manual UUID5 generation to the `generate_pipeline_id` utility function improves maintainability and consistency.

95-98: Great improvement: Structured pipeline run events.
The transition from yielding raw pipeline run objects to structured `PipelineRunStarted` events provides a much cleaner API contract and better type safety for consumers of this function.

108-111: Consistent event structure for pipeline yields.
The use of the `PipelineRunYield` wrapper maintains consistency with the structured event approach, making it easier for consumers to handle different pipeline states.

113-125: Improved error handling with structured events.
The error and completion handling now uses structured events (`PipelineRunCompleted`, `PipelineRunErrored`), which provides better consistency and makes it easier for consumers to distinguish between different pipeline states. The error re-raising preserves the original exception while still providing structured feedback.

cognee/context_global_variables.py (2)
12-13: Excellent use of ContextVar for async context isolation.
The use of `ContextVar` for database configurations is the correct approach for maintaining different database contexts across async tasks and threads. This enables proper isolation for dataset-scoped access control.

51-67: Well-structured database configuration setup.
The database configuration dictionaries are well-structured and the context variable setting provides proper isolation. The separation of vector and graph database configurations supports the multi-database architecture effectively.
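The isolation pattern this refers to can be sketched with the standard library alone; the variable and task names below are illustrative, not Cognee's actual `context_global_variables` implementation:

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable holding the active graph-database config.
graph_db_config: ContextVar[dict] = ContextVar("graph_db_config", default={})

async def run_for_dataset(dataset_name: str) -> str:
    # Each task runs in its own copy of the context, so this set() is
    # invisible to the other tasks running concurrently.
    graph_db_config.set({"graph_database_name": f"graph_{dataset_name}"})
    await asyncio.sleep(0)  # yield control so the tasks interleave
    return graph_db_config.get()["graph_database_name"]

async def main() -> list:
    return await asyncio.gather(*(run_for_dataset(n) for n in ("a", "b", "c")))

print(asyncio.run(main()))  # → ['graph_a', 'graph_b', 'graph_c']
```

Because `asyncio.gather` wraps each coroutine in a task with a copied context, every dataset sees only its own configuration, which is what makes dataset-scoped database access safe under concurrency.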
cognee-frontend/src/app/(graph)/GraphVisualization.tsx (2)
161-167: Excellent force simulation configuration.
The D3 force configuration with collision detection and charge forces is well-implemented. The parameters provide a good balance between node separation and graph stability.

176-225: Good fallback mechanism for graph rendering.
The conditional rendering with a fallback example graph provides a good user experience when no data is available. The example data clearly demonstrates the expected functionality.
cognee/api/v1/search/routers/get_search_router.py (4)
14-19: Excellent documentation and API design for dataset filtering.
The comment clearly explains the distinction between dataset names (owner-scoped) and dataset UUIDs (permission-based access). The optional fields provide flexible search scoping while maintaining security boundaries.

33-33: Consistent route path normalization.
Removing the trailing slash from route decorators improves consistency across the API. This change aligns with similar updates mentioned in other routers.

52-59: Excellent permission-aware search implementation.
The addition of dataset parameters to the search function and the graceful handling of `PermissionDeniedError` by returning an empty list instead of an error provides a good user experience while maintaining security.

36-36: Verify the impact of limit=0 parameter.
The addition of `limit=0` to the `get_history` call may change the behavior significantly. Ensure this change is intentional and that the search history endpoint should return unlimited results.

```shell
#!/bin/bash
# Description: Check the get_history function implementation to understand the limit parameter behavior
ast-grep --pattern 'def get_history($$$)'
```

cognee/modules/users/methods/get_authenticated_user.py (1)
6-6: Excellent simplification using FastAPI Users.
The migration from manual JWT handling to FastAPI Users' built-in `current_user` dependency significantly improves maintainability, reduces potential security issues, and follows established patterns. The `active=True` parameter ensures only active users can authenticate.

cognee-starter-kit/src/pipelines/low_level.py (2)
14-33: Consider adding validation methods to DataPoint classes.
The static analysis correctly identifies that these classes have too few public methods, but this is acceptable for data models. However, consider adding validation methods if business logic requires it.
The data model structure is clean and appropriate for representing the company hierarchy with proper metadata for indexing.
77-125: LGTM! Well-structured async pipeline with proper error isolation.
The main function properly sets up the Cognee environment, handles configuration, and orchestrates the pipeline execution. The use of async generators for pipeline status monitoring is appropriate.
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (3)
30-30: Good addition of BASE_LABEL constant for consistent node labeling.
This constant ensures all nodes have a common base label, which is essential for implementing proper uniqueness constraints and consistent querying patterns.

51-52: Appropriate configuration to disable Neo4j notifications.
Disabling notifications with `notifications_min_severity="OFF"` is a good practice for production environments to reduce log noise.

54-60: Well-implemented initialization method with proper constraint creation.
The initialize method correctly creates a uniqueness constraint on the `id` property for the base label, which is essential for data integrity.

cognee-mcp/README.md (1)

1-36: Excellent visual improvements and comprehensive structure.
The new header with logo, badges, and centered layout significantly improves the visual appeal and professionalism of the documentation.
cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py (3)
54-81: Excellent error handling and logging in API request method.
The `_make_request` method properly handles HTTP errors, logs detailed information, and provides good error context for debugging.

14-21: Well-implemented UUIDEncoder for JSON serialization.
The custom JSON encoder properly handles UUID objects by converting them to strings, which is essential for API communication.
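The encoder pattern described here can be sketched in a few lines (an illustrative stand-in, not the adapter's exact code):

```python
import json
import uuid

class UUIDEncoder(json.JSONEncoder):
    """JSON encoder that serializes UUID objects as plain strings."""

    def default(self, obj):
        if isinstance(obj, uuid.UUID):
            return str(obj)  # UUIDs are not JSON-serializable by default
        return super().default(obj)

node_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
payload = json.dumps({"node_id": node_id}, cls=UUIDEncoder)
print(payload)  # → {"node_id": "12345678-1234-5678-1234-567812345678"}
```

Falling back to `super().default(obj)` preserves the standard error for genuinely unserializable types.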
183-197: Good schema initialization pattern with proper error handling.
The schema initialization logic correctly checks for existing schema and creates it only when needed, with appropriate error handling and logging.
cognee-frontend/types/d3-force-3d.d.ts (2)
10-20: Well-defined 3D simulation node interface.
The `SimulationNodeDatum3D` interface properly extends the base 2D interface with the necessary 3D position, velocity, and fixed position properties. The optional nature of `fx`, `fy`, and `fz` is appropriate.

22-41: Comprehensive 3D force function declarations.
The exported functions cover all the essential force types for 3D simulations, including center, many-body, link, collision, radial, and directional forces. The type signatures are consistent and properly typed.
cognee/api/v1/add/routers/get_add_router.py (1)
53-55: LGTM! Good improvement to return operation results.
The addition of capturing and returning the `add_run` result improves the API by providing feedback about the operation. The use of `model_dump()` properly serializes the result for JSON response.

cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (1)
90-93: LGTM! Good form submission handling.
The enter key handling with shift key detection for multiline input is well implemented. The use of `requestSubmit()` ensures proper form validation.

cognee/infrastructure/databases/graph/get_graph_engine.py (4)
5-5: LGTM! Context-aware configuration improves multi-tenancy.
The change from `get_graph_config` to `get_graph_context_config` aligns with the PR's objective of implementing dataset-scoped database management. This enables per-request or per-dataset database configurations.

Also applies to: 13-13

21-23: Good addition of async initialization pattern.
The conditional async initialization allows adapters to perform setup operations without requiring all adapters to implement the pattern. This is a clean extension point.

35-39: Parameter reordering may break existing code.
The reordering of parameters in `create_graph_engine` (moving `graph_file_path` before connection parameters) could break existing callers that rely on positional arguments.

```shell
#!/bin/bash
# Description: Check for any direct calls to create_graph_engine that might be affected
# Expected: Find calls using positional arguments that need updating
rg -A 5 "create_graph_engine\(" --type py
```

114-125: I want to inspect the full adapter to confirm available methods and error handling. Let's fetch its contents and look for any async methods:

```shell
#!/bin/bash
# Print the adapter file
sed -n '1,200p' cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
# Check specifically for async methods
rg "async def" -n cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py
```

cognee-frontend/src/app/(graph)/GraphControls.tsx (1)
147-153: LGTM! Proper cleanup of timeout in useEffect.
The cleanup function properly clears the timeout to prevent memory leaks. The dependency array correctly includes all relevant variables.
cognee/modules/pipelines/operations/pipeline.py (5)
48-53: Verify context variable handling and documentation.
The context variable setup is well-designed for async task isolation. The comment explains the purpose clearly.

90-92: Good error handling for dataset validation.
The explicit check and custom exception provide clear feedback when no datasets are available for processing.

115-117: Ensure database context isolation per dataset.
The context setup before task execution is crucial for backend access control. Good implementation.

166-169: Verify context parameter threading through run_tasks.
Ensure the context parameter is properly handled throughout the task execution chain.
```shell
#!/bin/bash
# Description: Verify context parameter is properly threaded through run_tasks
# Check run_tasks function signature and usage
ast-grep --pattern 'def run_tasks($$$, context=$_, $$$)'
# Check if context is used within run_tasks implementation
rg -A 20 "def run_tasks" | head -30
```
93-102:

```shell
#!/bin/bash
# Search for all usages of run_pipeline to identify how it's consumed
rg -Hn "run_pipeline" -A 5 -B 5
# Look for any asyncio.gather patterns that might expect full results
rg -Hn "asyncio.gather" -A 5 -B 5
# Check for list() or any eager conversion around run_pipeline
rg -Hn "list(.*run_pipeline" -A 5 -B 5
rg -Hn "await run_pipeline" -A 5 -B 5
```

cognee/modules/search/methods/search.py (2)

`196-202`: **Good concurrent search implementation.** The asyncio.gather approach efficiently processes multiple datasets concurrently while maintaining context isolation.

`187-193`:

```shell
#!/bin/bash
# Dump the full implementation of get_or_create_dataset_database for inspection
sed -n '1,200p' cognee/infrastructure/databases/utils/get_or_create_dataset_database.py
```

cognee/api/v1/permissions/routers/get_permissions_router.py (4)

`19-19`: **FastAPI Depends usage is correct here.** The Ruff warning about B008 is a false positive in this context. FastAPI's dependency injection system requires Depends() in function signatures.

`35-40`: **Consistent authentication enforcement.** Good that all endpoints now require authentication. The owner_id pattern is consistently applied.

`52-60`: **New tenant membership endpoint looks good.** The endpoint follows the established pattern with proper authentication and ownership tracking.

`23-28`:

```shell
#!/bin/bash
# Correctly retrieve implementation of get_specific_user_permission_datasets
rg -A 50 "def get_specific_user_permission_datasets" cognee/modules/users/permissions/methods
```

cognee/api/v1/cognify/cognify.py (3)
102-113: Good background processing implementation.
The async task creation and queue management for background processing is well implemented. The StopAsyncIteration handling ensures graceful completion.

128-128: Verify permission system migration.
The change from `check_permissions_on_documents` to `check_permissions_on_dataset` aligns with the system-wide permission refactor.
100-105:

```shell
#!/bin/bash
set -e

echo "■ Checking setup.py for python_requires"
rg "python_requires" -n setup.py || echo "  (not found)"

echo
echo "■ Checking pyproject.toml for requires-python"
rg "requires-python" -n pyproject.toml || echo "  (not found)"

echo
echo "■ Checking setup.cfg for requires-python"
rg "requires-python" -n setup.cfg || echo "  (not found)"

echo
echo "■ Inspecting GitHub Actions workflows for Python versions"
rg "python-version" -n .github/workflows || echo "  (no workflows or no python-version key)"
```

cognee/api/v1/cognify/routers/get_cognify_router.py (5)

`45-48`: **Good input validation for dataset parameters.** The validation ensures at least one dataset identifier is provided, preventing empty processing requests.

`96-121`: **Robust WebSocket loop with proper cleanup.** The infinite loop with queue polling, error handling, and cleanup on disconnect/completion is well implemented. The 2-second polling interval provides a good balance.

`119-121`: **Good WebSocket disconnect handling.** Proper queue cleanup on WebSocket disconnect prevents memory leaks.

`107-113`: **Verify graph data formatting performance.** The `get_formatted_graph_data` call in the WebSocket loop could be expensive. Consider caching or optimization for real-time updates.

```shell
#!/bin/bash
# Description: Check graph data formatting implementation for performance
# Find get_formatted_graph_data implementation
ast-grep --pattern 'def get_formatted_graph_data($$$)'
# Look for any caching or optimization patterns
rg -A 20 "get_formatted_graph_data"
```
67-88:

```shell
#!/bin/bash
# Display the implementation of get_authenticated_user to verify token validation logic
sed -n '1,200p' cognee/modules/users/methods/get_authenticated_user.py
```

cognee/modules/ontology/rdf_xml/OntologyResolver.py (6)

`6-6`: **Import change looks good for rdflib migration.** The import correctly brings in the necessary rdflib components to replace owlready2 functionality.

`17-34`: **Well-designed wrapper class for ontology nodes.** The `AttachedOntologyNode` class provides a clean abstraction that handles URI name extraction properly, supporting both "#" and "/" separators. The implementation is straightforward and the `__repr__` method aids debugging.

`37-54`: **Good handling of optional ontology file loading.** The constructor properly handles cases where the ontology file doesn't exist by setting `self.graph = None` and logging an informative message. Error handling with proper exception chaining is also well implemented.

`55-62`: **URI normalization logic is correct.** The `_uri_to_key` method properly extracts names from URIs and normalizes them consistently. The handling of both "#" and "/" separators covers the common URI patterns.

`63-98`: **Refactored lookup building logic looks correct.** The method properly handles the case where `self.graph` is None and correctly identifies classes and individuals using RDF triple patterns. The logic for finding individuals by checking if their type is a known class is sound.

`119-125`: **Category detection method is well-implemented.** The `_get_category` helper method provides a clean way to determine whether a URI represents a class or individual by checking against the lookup dictionaries.

cognee/api/v1/datasets/routers/get_datasets_router.py (4)

`50-69`: **Well-structured DTO classes for graph data.** The new Pydantic models (`GraphNodeDTO`, `GraphEdgeDTO`, `GraphDTO`, `DatasetCreationPayload`) provide clear structure for graph data and dataset creation requests. The use of `OutDTO` and `InDTO` base classes maintains consistency with the API design.
`87-114`: **Dataset creation endpoint with proper permission handling.** The new endpoint correctly checks for existing datasets before creation and assigns full permissions to the creator. The implementation properly handles the database session and includes appropriate error handling.

`74-86`: **Permission-based dataset retrieval is secure.** The modified `get_datasets` endpoint now properly filters datasets based on user permissions, which is a significant security improvement over the previous implementation.

`125-125`: **Improved exception handling with specific error types.** The use of `DatasetNotFoundError` and `DataNotFoundError` instead of generic exceptions provides better error categorization and makes debugging easier. The error messages are also informative.

Also applies to: 144-144, 149-149, 226-226, 232-234, 239-241
```python
owner_id = Column(UUID, index=True)
```
🛠️ Refactor suggestion
Address schema design and migration concerns for the new owner_id column.
The addition of owner_id introduces important functionality, but several design aspects need clarification:
- Missing foreign key constraint - Should this reference `User.id`?
- Nullable specification - Is owner required or optional for tenants?
- Migration strategy - How will existing tenant records handle this new column?
- Missing relationship - Consider adding a relationship to the owner User.
Consider this enhanced implementation:
- owner_id = Column(UUID, index=True)
+ owner_id = Column(UUID, ForeignKey("users.id"), nullable=True, index=True)
+
+ # Relationship to owner user
+ owner = relationship("User", foreign_keys=[owner_id])

Also ensure proper database migration scripts are in place for existing tenants.
🤖 Prompt for AI Agents
In cognee/modules/users/models/Tenant.py around lines 14 to 15, the owner_id
column lacks a foreign key constraint, nullable specification, and a
relationship to the User model. Fix this by adding a ForeignKey constraint
referencing User.id, specify whether owner_id is nullable based on whether the
owner is required or optional, and add a SQLAlchemy relationship to the User
model for owner. Additionally, create and apply a database migration script to
add this column to existing tenant records, handling default values or nulls as
appropriate.
```python
if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
```
🛠️ Refactor suggestion
Misuse of setup_logging overriding logger
Assigning logger = setup_logging(...) likely masks the module‐level logger and assumes setup_logging returns a logger. Instead, call setup_logging(log_level=ERROR) to configure global logging, then use get_logger(level=ERROR) to retrieve the logger.
🤖 Prompt for AI Agents
In cognee/modules/retrieval/utils/description_to_codepart_search.py at line 147,
the code incorrectly assigns the result of setup_logging(log_level=ERROR) to the
variable logger, which masks the module-level logger and assumes setup_logging
returns a logger. To fix this, call setup_logging(log_level=ERROR) without
assignment to configure global logging, then separately call
get_logger(level=ERROR) to retrieve and assign the logger variable.
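The configure-then-retrieve split this asks for looks like the following with the standard library (an illustrative stand-in for Cognee's `setup_logging`/`get_logger`, whose exact signatures are not shown here):

```python
import logging

def setup_logging(log_level: int) -> None:
    # Configure global logging once; deliberately returns nothing.
    logging.basicConfig(level=log_level)

def get_logger(level: int = logging.NOTSET) -> logging.Logger:
    logger = logging.getLogger("description_to_codepart_search")
    logger.setLevel(level)
    return logger

setup_logging(log_level=logging.ERROR)    # configure globally, no assignment
logger = get_logger(level=logging.ERROR)  # then fetch the logger separately

assert logger.level == logging.ERROR
assert not logger.isEnabledFor(logging.INFO)
```

Keeping configuration and retrieval separate avoids relying on a configuration function's return value, which is exactly the assumption the review flags.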
```python
def check_dataset_name(dataset_name: str):
    if "." in dataset_name or " " in dataset_name:
        raise ValueError("Dataset name cannot contain spaces or underscores")
```
Fix inconsistency between validation logic and error message.
The function checks for spaces and periods in the dataset name, but the error message mentions "spaces or underscores". This inconsistency could confuse users.
Apply this diff to align the error message with the validation logic:
- raise ValueError("Dataset name cannot contain spaces or underscores")
+ raise ValueError("Dataset name cannot contain spaces or periods")

Alternatively, if underscores should also be forbidden, update the validation logic:
- if "." in dataset_name or " " in dataset_name:
+ if "." in dataset_name or " " in dataset_name or "_" in dataset_name:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def check_dataset_name(dataset_name: str): | |
| if "." in dataset_name or " " in dataset_name: | |
| raise ValueError("Dataset name cannot contain spaces or underscores") | |
| def check_dataset_name(dataset_name: str): | |
| if "." in dataset_name or " " in dataset_name: | |
| raise ValueError("Dataset name cannot contain spaces or periods") |
🤖 Prompt for AI Agents
In cognee/modules/data/methods/check_dataset_name.py lines 1 to 3, the error
message incorrectly states that dataset names cannot contain spaces or
underscores, while the validation logic actually checks for spaces and periods.
To fix this, either update the error message to say "Dataset name cannot contain
spaces or periods" to match the current checks, or modify the validation logic
to also check for underscores if those should be forbidden as well.
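The first option can be verified with a runnable sketch that keeps the existing checks and aligns the message with them:

```python
def check_dataset_name(dataset_name: str) -> None:
    # Message now matches what is actually checked: spaces and periods.
    if "." in dataset_name or " " in dataset_name:
        raise ValueError("Dataset name cannot contain spaces or periods")

check_dataset_name("valid_name")  # underscores remain allowed in this variant

for bad in ("my dataset", "my.dataset"):
    try:
        check_dataset_name(bad)
    except ValueError as err:
        print(f"{bad!r} rejected: {err}")
```

Whichever option is chosen, keeping the message and the condition in lockstep prevents the confusion the review points out.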
```css
  text-decoration: none;
}

@import "tailwindcss";
```
Fix @import positioning to comply with CSS standards.
The @import rule must be placed at the beginning of the CSS file (after any @charset or @layer rules) to be valid. Currently, it's positioned after other CSS rules, which could cause browsers to ignore it.
Move the import to the top of the file:
+@import "tailwindcss";
+
:root {
--max-width: 1100px;
--border-radius: 2px;
--font-mono: ui-monospace, Menlo, Monaco, "Cascadia Mono", "Segoe UI Mono",
"Roboto Mono", "Oxygen Mono", "Ubuntu Monospace", "Source Code Pro",
"Fira Mono", "Droid Sans Mono", "Courier New", monospace;
--button-padding: 14px 20px !important;
--button-border-radius: 100px !important;
--global-color-primary: #6510F4 !important;
--global-color-primary-active: #500cc5 !important;
--global-color-primary-text: white !important;
--global-color-secondary: #0DFF00 !important;
--global-background-default: #0D051C;
--textarea-default-color: #0D051C !important;
}
html,
body {
height: 100%;
max-width: 100vw;
overflow-x: hidden;
}
a {
color: inherit;
text-decoration: none;
}
-
-@import "tailwindcss";

🧰 Tools
🪛 Biome (1.9.4)
[error] 30-30: This @import is in the wrong position.
Any @import rules must precede all other valid at-rules and style rules in a stylesheet (ignoring @charset and @layer), or else the @import rule is invalid.
Consider moving import position.
(lint/correctness/noInvalidPositionAtImportRule)
🤖 Prompt for AI Agents
In cognee-frontend/src/app/globals.css at line 30, the @import "tailwindcss";
rule is placed after other CSS rules, which violates CSS standards. Move this
@import statement to the very top of the file, before any other CSS rules,
except for any @charset or @layer rules if present, to ensure it is properly
recognized by browsers.
```python
async def get_pipeline_run(pipeline_run_id: UUID):
    db_engine = get_relational_engine()

    async with db_engine.get_async_session() as session:
        query = select(PipelineRun).filter(PipelineRun.pipeline_run_id == pipeline_run_id)

        return await session.scalar(query)
```
🛠️ Refactor suggestion
Add return type annotation and consider error handling.
The function is missing a return type annotation and lacks explicit error handling for potential database connection issues.
+from typing import Optional
+
-async def get_pipeline_run(pipeline_run_id: UUID):
+async def get_pipeline_run(pipeline_run_id: UUID) -> Optional[PipelineRun]:
+ """Retrieve a pipeline run by its UUID.
+
+ Args:
+ pipeline_run_id: The UUID of the pipeline run to retrieve
+
+ Returns:
+ The PipelineRun instance if found, None otherwise
+ """
db_engine = get_relational_engine()
async with db_engine.get_async_session() as session:
query = select(PipelineRun).filter(PipelineRun.pipeline_run_id == pipeline_run_id)
     return await session.scalar(query)
🤖 Prompt for AI Agents
In cognee/modules/pipelines/methods/get_pipeline_run.py around lines 9 to 15,
add a return type annotation to the get_pipeline_run function to specify the
expected return type, such as Optional[PipelineRun]. Additionally, wrap the
database query logic in a try-except block to catch and handle potential
database connection errors, logging or raising an appropriate exception to
ensure robust error handling.
```python
if existing_metrics:
    metrics_for_pipeline_runs.append(existing_metrics)
else:
    graph_metrics = await graph_engine.get_graph_metrics(include_optional)
    metrics = GraphMetrics(
        id=pipeline_run.pipeline_run_id,
        num_tokens=await fetch_token_count(db_engine),
        num_nodes=graph_metrics["num_nodes"],
        num_edges=graph_metrics["num_edges"],
        mean_degree=graph_metrics["mean_degree"],
        edge_density=graph_metrics["edge_density"],
        num_connected_components=graph_metrics["num_connected_components"],
        sizes_of_connected_components=graph_metrics["sizes_of_connected_components"],
        num_selfloops=graph_metrics["num_selfloops"],
        diameter=graph_metrics["diameter"],
        avg_shortest_path_length=graph_metrics["avg_shortest_path_length"],
        avg_clustering=graph_metrics["avg_clustering"],
    )
    metrics_for_pipeline_runs.append(metrics)
    session.add(metrics)
```
🛠️ Refactor suggestion
Add error handling for graph metrics retrieval.
Consider adding error handling for the graph engine operations, as network or database issues could cause failures.
else:
- graph_metrics = await graph_engine.get_graph_metrics(include_optional)
+ try:
+ graph_metrics = await graph_engine.get_graph_metrics(include_optional)
+ except Exception as e:
+ # Log the error and either re-raise or return None/empty metrics
+ # depending on your error handling strategy
+ raise RuntimeError(f"Failed to retrieve graph metrics: {e}") from e
+
metrics = GraphMetrics(📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| existing_metrics = existing_metrics.scalars().first() | |
| if existing_metrics: | |
| metrics_for_pipeline_runs.append(existing_metrics) | |
| else: | |
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) | |
| else: | |
| try: | |
| graph_metrics = await graph_engine.get_graph_metrics(include_optional) | |
| except Exception as e: | |
| # Log the error and either re-raise or return None/empty metrics | |
| # depending on your error handling strategy | |
| raise RuntimeError(f"Failed to retrieve graph metrics: {e}") from e | |
| metrics = GraphMetrics( | |
| id=pipeline_run.pipeline_run_id, | |
| num_tokens=await fetch_token_count(db_engine), | |
| num_nodes=graph_metrics["num_nodes"], | |
| num_edges=graph_metrics["num_edges"], | |
| mean_degree=graph_metrics["mean_degree"], | |
| edge_density=graph_metrics["edge_density"], | |
| num_connected_components=graph_metrics["num_connected_components"], | |
| sizes_of_connected_components=graph_metrics["sizes_of_connected_components"], | |
| num_selfloops=graph_metrics["num_selfloops"], | |
| diameter=graph_metrics["diameter"], | |
| avg_shortest_path_length=graph_metrics["avg_shortest_path_length"], | |
| avg_clustering=graph_metrics["avg_clustering"], | |
| ) | |
| metrics_for_pipeline_runs.append(metrics) | |
| session.add(metrics) |
🤖 Prompt for AI Agents
In cognee/modules/metrics/operations/get_pipeline_run_metrics.py around lines 41
to 58, the code calls graph_engine.get_graph_metrics without error handling,
which could cause unhandled exceptions if network or database issues occur. Wrap
the call to get_graph_metrics and subsequent processing in a try-except block to
catch potential exceptions, log the error details, and handle the failure
gracefully, such as by skipping the current pipeline run or returning a default
value.
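The prompt above mentions skipping the current pipeline run when metric collection fails. A minimal, framework-free sketch of that skip-on-error pattern (all names here are hypothetical stand-ins, not Cognee's API):

```python
import logging

logger = logging.getLogger(__name__)

def collect_metrics(pipeline_runs, compute_metrics):
    """Skip runs whose metric computation fails instead of aborting the whole batch."""
    collected = []
    for run in pipeline_runs:
        try:
            collected.append(compute_metrics(run))
        except Exception as error:  # narrow to expected engine errors in real code
            logger.error("Skipping run %s: %s", run, error)
    return collected

def compute(run):
    # Stand-in for graph_engine.get_graph_metrics(...)
    if run == "bad-run":
        raise ConnectionError("graph engine unreachable")
    return {"run": run, "num_nodes": 42}

metrics = collect_metrics(["run-1", "bad-run", "run-2"], compute)
```

The trade-off versus re-raising is that partial results are returned silently, so the log line is the only trace of the failure.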
graph_client = await get_graph_engine()
(nodes, edges) = await graph_client.get_graph_data()
🛠️ Refactor suggestion
Add error handling for database operations.
Consider adding error handling for the graph engine operations, as they could fail due to network or database issues.
- graph_client = await get_graph_engine()
- (nodes, edges) = await graph_client.get_graph_data()
+ try:
+ graph_client = await get_graph_engine()
+ (nodes, edges) = await graph_client.get_graph_data()
+ except Exception as e:
+ raise RuntimeError(f"Failed to retrieve graph data: {e}") from e
🤖 Prompt for AI Agents
In cognee/modules/graph/methods/get_formatted_graph_data.py around lines 9 to
10, the calls to get_graph_engine() and graph_client.get_graph_data() lack error
handling, which could cause unhandled exceptions if network or database issues
occur. Wrap these calls in a try-except block to catch potential exceptions, log
or handle the errors appropriately, and ensure the function can fail gracefully
or retry as needed.
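The prompt also suggests the caller could "retry as needed"; one generic way to sketch that, assuming the failures are transient network errors, is an exponential-backoff wrapper (a hypothetical helper, not part of Cognee):

```python
import asyncio

async def with_retries(make_call, attempts=3, base_delay=0.01):
    """Retry an async call with exponential backoff, wrapping the last error."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception as error:  # narrow to ConnectionError/TimeoutError in real code
            last_error = error
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Call failed after {attempts} attempts") from last_error

# Example: a flaky call that succeeds on the third try.
calls = {"count": 0}

async def flaky_get_graph_data():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network issue")
    return ([1, 2], [(1, 2)])

nodes, edges = asyncio.run(with_retries(flaky_get_graph_data))
```

Backoff keeps repeated failures from hammering the graph backend while still surfacing a hard error once the budget is exhausted.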
if datasets:
    # Function handles transforming dataset input to dataset IDs (if possible)
    dataset_ids = await get_dataset_ids(datasets, user)
    # If dataset_ids are provided filter these datasets based on what user has permission for.
    if dataset_ids:
        existing_datasets = await get_specific_user_permission_datasets(
            user.id, permission_type, dataset_ids
        )
    else:
        existing_datasets = []
else:
    # If no datasets are provided, work with all existing datasets user has permission for.
    existing_datasets = await get_all_user_permission_datasets(user, permission_type)

return existing_datasets
🛠️ Refactor suggestion
Add error handling for helper function calls.
The function doesn't handle potential exceptions from get_dataset_ids, get_specific_user_permission_datasets, or get_all_user_permission_datasets, which could cause unhandled errors to propagate.
if datasets:
# Function handles transforming dataset input to dataset IDs (if possible)
- dataset_ids = await get_dataset_ids(datasets, user)
+ try:
+ dataset_ids = await get_dataset_ids(datasets, user)
+ except Exception as e:
+ # Log error and return empty list or re-raise depending on requirements
+ return []
# If dataset_ids are provided filter these datasets based on what user has permission for.
if dataset_ids:
- existing_datasets = await get_specific_user_permission_datasets(
- user.id, permission_type, dataset_ids
- )
+ try:
+ existing_datasets = await get_specific_user_permission_datasets(
+ user.id, permission_type, dataset_ids
+ )
+ except Exception as e:
+ return []
else:
existing_datasets = []
else:
# If no datasets are provided, work with all existing datasets user has permission for.
- existing_datasets = await get_all_user_permission_datasets(user, permission_type)
+ try:
+ existing_datasets = await get_all_user_permission_datasets(user, permission_type)
+ except Exception as e:
+ return []
🤖 Prompt for AI Agents
In cognee/modules/data/methods/get_authorized_existing_datasets.py around lines
25 to 39, add try-except blocks around the calls to get_dataset_ids,
get_specific_user_permission_datasets, and get_all_user_permission_datasets to
catch potential exceptions. Handle these exceptions appropriately, such as
logging the error and returning an empty list or a suitable fallback, to prevent
unhandled errors from propagating and crashing the function.
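If re-raising is not desired, the fallback strategy described above can be factored into a single small helper instead of three separate try-except blocks. This is a sketch with invented names; real code should catch narrower exception types than bare `Exception`:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def call_with_fallback(coro, fallback):
    """Await a coroutine; on failure, log the error and return a fallback value."""
    try:
        return await coro
    except Exception as error:  # narrow to expected DB/permission errors in real code
        logger.error("Permission lookup failed: %s", error)
        return fallback

# Stand-ins for the permission helpers in the review comment.
async def failing_lookup():
    raise TimeoutError("database unavailable")

async def working_lookup():
    return ["dataset-a", "dataset-b"]

empty = asyncio.run(call_with_fallback(failing_lookup(), fallback=[]))
datasets = asyncio.run(call_with_fallback(working_lookup(), fallback=[]))
```

Centralizing the fallback keeps the main function's branching readable and makes the error policy easy to change in one place.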
export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void) {
  // const data = await (
  return fetch("/v1/cognify", {
    method: "POST",
    headers: {
-      'Content-Type': 'application/json',
+      "Content-Type": "application/json",
    },
    body: JSON.stringify({
-      datasets: [dataset.id || dataset.name],
+      datasetIds: [dataset.id],
      runInBackground: false,
    }),
-  }).then((response) => response.json());
+  })
+    .then((response) => response.json())
+    .then(() => {
+      return getDatasetGraph(dataset)
+        .then((data) => {
+          onUpdate({
+            nodes: data.nodes,
+            edges: data.edges,
+          });
+        });
+    });
🛠️ Refactor suggestion
Add error handling and improve return consistency.
The function lacks error handling and has an inconsistent return pattern that could cause issues if the API call or graph fetching fails.
-export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void) {
- // const data = await (
- return fetch("/v1/cognify", {
+export default async function cognifyDataset(dataset: Dataset, onUpdate: (data: GraphData) => void): Promise<void> {
+ if (!dataset?.id) {
+ throw new Error("Dataset ID is required");
+ }
+
+ try {
+ await fetch("/v1/cognify", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
datasetIds: [dataset.id],
runInBackground: false,
}),
- })
- .then((response) => response.json())
- .then(() => {
- return getDatasetGraph(dataset)
- .then((data) => {
- onUpdate({
- nodes: data.nodes,
- edges: data.edges,
- });
- });
- });
- // )
+ });
+
+ const data = await getDatasetGraph(dataset);
+ onUpdate({
+ nodes: data.nodes,
+ edges: data.edges,
+ });
+ } catch (error) {
+ console.error("Failed to cognify dataset:", error);
+ throw error;
+  }
🤖 Prompt for AI Agents
In cognee-frontend/src/modules/datasets/cognifyDataset.ts around lines 10 to 31,
the function lacks error handling and returns inconsistently, which can cause
issues if the fetch or graph retrieval fails. Add try-catch blocks or use
promise catch handlers to handle errors from both the fetch call and
getDatasetGraph. Also, ensure the function consistently returns a value or a
promise, such as returning the final data or a status, to maintain predictable
behavior.
export default function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
  return new Promise((resolve, reject) => {
    if (response.status === 401) {
-      window.location.href = '/auth';
-      return;
+      if (retry) {
+        return retry(response)
+          .catch(() => {
+            return redirect("/auth/login");
+          });
+      } else {
+        return redirect("/auth/login");
+      }
    }
    if (!response.ok) {
      return response.json().then(error => reject(error));
    }

-    return resolve(response);
+    if (response.status >= 200 && response.status < 300) {
+      return resolve(response);
+    }
+
+    return reject(response);
  });
}
Fix promise handling and simplify logic.
The function has several issues with promise handling and redundant logic that could cause unexpected behavior.
-export default function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
- return new Promise((resolve, reject) => {
- if (response.status === 401) {
- if (retry) {
- return retry(response)
- .catch(() => {
- return redirect("/auth/login");
- });
- } else {
- return redirect("/auth/login");
- }
- }
- if (!response.ok) {
- return response.json().then(error => reject(error));
- }
-
- if (response.status >= 200 && response.status < 300) {
- return resolve(response);
- }
-
- return reject(response);
- });
-}
+export default async function handleServerErrors(response: Response, retry?: (response: Response) => Promise<Response>): Promise<Response> {
+ if (response.status === 401) {
+ if (retry) {
+ try {
+ return await retry(response);
+ } catch {
+ redirect("/auth/login");
+ return response; // This won't be reached due to redirect
+ }
+ } else {
+ redirect("/auth/login");
+ return response; // This won't be reached due to redirect
+ }
+ }
+
+ if (!response.ok) {
+ const error = await response.json();
+ throw error;
+ }
+
+ return response;
+}

🤖 Prompt for AI Agents
In cognee-frontend/src/utils/handleServerErrors.ts lines 3 to 25, the promise
handling is inconsistent and the logic is redundant, which may cause unexpected
behavior. Refactor the function to avoid mixing explicit Promise construction
with returning promises inside it. Use async/await or chain promises properly to
handle the 401 status with retry logic, handle non-ok responses by rejecting
with parsed error JSON, and resolve or reject based on response status cleanly
without redundant checks.
vector_db_config: dict = None,
graph_db_config: dict = None,
dataset_id: UUID = None,
🛠️ Refactor suggestion
Add proper type hints and parameter validation.
The new parameters should be properly typed as Optional and validated.
user: User = None,
node_set: Optional[List[str]] = None,
- vector_db_config: dict = None,
- graph_db_config: dict = None,
- dataset_id: UUID = None,
+ vector_db_config: Optional[dict] = None,
+ graph_db_config: Optional[dict] = None,
+ dataset_id: Optional[UUID] = None,
):
+ # Validate that either dataset_name or dataset_id is provided
+ if not dataset_name and not dataset_id:
+ raise ValueError("Either dataset_name or dataset_id must be provided")
🤖 Prompt for AI Agents
In cognee/api/v1/add/add.py around lines 15 to 17, the parameters
vector_db_config, graph_db_config, and dataset_id lack proper Optional type
hints and validation. Update their type hints to use Optional from typing (e.g.,
Optional[dict] and Optional[UUID]) and add validation logic to check if the
parameters are None or of the expected type before use, raising appropriate
errors if validation fails.
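The validation the prompt asks for can be illustrated standalone. `resolve_dataset` below is a hypothetical stand-in for the real `add` signature, showing the `Optional` hints and the either-or check:

```python
from typing import Optional, Union
from uuid import UUID, uuid4

def resolve_dataset(
    dataset_name: Optional[str] = None,
    dataset_id: Optional[UUID] = None,
) -> Union[str, UUID]:
    """Require at least one of name or id; prefer the id when both are given."""
    if not dataset_name and not dataset_id:
        raise ValueError("Either dataset_name or dataset_id must be provided")
    return dataset_id if dataset_id else dataset_name

name_result = resolve_dataset(dataset_name="my_dataset")
id_value = uuid4()
id_result = resolve_dataset(dataset_id=id_value)

try:
    resolve_dataset()  # neither argument: should fail fast
    raised = False
except ValueError:
    raised = True
```

Failing fast at the boundary keeps the `None` from propagating into the pipeline where the eventual error would be harder to trace.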
pipeline_run_info = None

async for run_info in cognee_pipeline(
    tasks=tasks,
    datasets=dataset_id if dataset_id else dataset_name,
    data=data,
    user=user,
    pipeline_name="add_pipeline",
    vector_db_config=vector_db_config,
    graph_db_config=graph_db_config,
):
    pipeline_run_info = run_info

return pipeline_run_info
🛠️ Refactor suggestion
Add error handling and ensure pipeline_run_info is not None.
The async iteration lacks error handling and could return None if the pipeline yields no results.
pipeline_run_info = None
- async for run_info in cognee_pipeline(
- tasks=tasks,
- datasets=dataset_id if dataset_id else dataset_name,
- data=data,
- user=user,
- pipeline_name="add_pipeline",
- vector_db_config=vector_db_config,
- graph_db_config=graph_db_config,
- ):
- pipeline_run_info = run_info
+ try:
+ async for run_info in cognee_pipeline(
+ tasks=tasks,
+ datasets=dataset_id if dataset_id else dataset_name,
+ data=data,
+ user=user,
+ pipeline_name="add_pipeline",
+ vector_db_config=vector_db_config,
+ graph_db_config=graph_db_config,
+ ):
+ pipeline_run_info = run_info
+ except Exception as e:
+ # Log the error and re-raise with context
+ raise RuntimeError(f"Pipeline execution failed: {str(e)}") from e
- return pipeline_run_info
+ if pipeline_run_info is None:
+ raise RuntimeError("Pipeline completed but no run info was generated")
+
+ return pipeline_run_info
🤖 Prompt for AI Agents
In cognee/api/v1/add/add.py around lines 24 to 37, add error handling around the
async for loop that iterates over cognee_pipeline to catch and handle any
exceptions that may occur during iteration. After the loop, check if
pipeline_run_info is still None, and if so, raise an appropriate error or return
a meaningful response to ensure the function does not return None silently.
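The shape of that guard, using a stand-in async generator instead of `cognee_pipeline` (the dict statuses here are invented for illustration), could be:

```python
import asyncio

async def fake_pipeline():
    """Stand-in for cognee_pipeline: yields successive status updates."""
    for status in ("started", "processing", "completed"):
        yield {"status": status}

async def run_and_get_last_info(pipeline):
    """Consume the pipeline, keep the last update, and refuse to return None."""
    last_info = None
    try:
        async for run_info in pipeline:
            last_info = run_info
    except Exception as error:
        raise RuntimeError(f"Pipeline execution failed: {error}") from error
    if last_info is None:
        raise RuntimeError("Pipeline completed but no run info was generated")
    return last_info

final_info = asyncio.run(run_and_get_last_info(fake_pipeline()))
```

The post-loop `None` check is what distinguishes "pipeline yielded nothing" from a normal run, which the original code silently conflated.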
return addData(dataset, files)
  .then(() => {
    const onUpdate = (data: NodesAndEdges) => {
      onData({
        nodes: data.nodes,
        links: data.edges,
      });
      setProcessingFilesDone();
    };

    return cognifyDataset(dataset, onUpdate)
      .then(() => {
        refreshDatasets();
      });
  });
🛠️ Refactor suggestion
Improve error handling in file upload chain.
The promise chain lacks proper error handling, which could leave the component in a processing state indefinitely.
- return addData(dataset, files)
- .then(() => {
- const onUpdate = (data: NodesAndEdges) => {
- onData({
- nodes: data.nodes,
- links: data.edges,
- });
- setProcessingFilesDone();
- };
-
- return cognifyDataset(dataset, onUpdate)
- .then(() => {
- refreshDatasets();
- });
- });
+ return addData(dataset, files)
+ .then(() => {
+ const onUpdate = (data: NodesAndEdges) => {
+ onData({
+ nodes: data.nodes,
+ links: data.edges,
+ });
+ setProcessingFilesDone();
+ };
+
+ return cognifyDataset(dataset, onUpdate)
+ .then(() => {
+ refreshDatasets();
+ });
+ })
+ .catch((error) => {
+ console.error("Failed to process files:", error);
+ setProcessingFilesDone();
+ // Show error message to user
+ alert("Failed to process files. Please try again.");
+ });
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx around lines 77 to 91,
the promise chain handling file upload and dataset processing lacks error
handling, risking the component staying in a processing state indefinitely. Add
catch blocks to handle errors from addData and cognifyDataset calls, ensuring
that on errors, the component state is updated appropriately (e.g., clearing
processing flags) and errors are logged or reported to inform the user or
developer.
if (!event.target.files) {
  return;
}

const files: File[] = Array.from(event.target.files);

if (!files.length) {
  return;
}
🛠️ Refactor suggestion
Add file validation and type checking.
The file upload handling lacks validation for file types, sizes, and other constraints that could prevent processing issues.
if (!event.target.files) {
+ setProcessingFilesDone();
return;
}
const files: File[] = Array.from(event.target.files);
if (!files.length) {
+ setProcessingFilesDone();
return;
}
+ // Validate file types and sizes
+ const maxFileSize = 10 * 1024 * 1024; // 10MB
+ const allowedTypes = ['text/plain', 'application/pdf', 'application/msword'];
+
+ const invalidFiles = files.filter(file =>
+ file.size > maxFileSize || !allowedTypes.includes(file.type)
+ );
+
+ if (invalidFiles.length > 0) {
+ alert(`Invalid files detected. Please check file types and sizes.`);
+ setProcessingFilesDone();
+ return;
+ }
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CogneeAddWidget.tsx around lines 67 to 75,
the file upload handler currently does not validate the uploaded files. Add
checks to validate file types against allowed MIME types, verify file sizes do
not exceed defined limits, and handle any other relevant constraints before
processing the files. This will prevent invalid or problematic files from being
processed.
  });
}

return global.fetch("http://localhost:8000/api" + url, {
🛠️ Refactor suggestion
Make API base URL configurable.
The hard-coded http://localhost:8000/api base URL should be configurable through environment variables to support different deployment environments.
- return global.fetch("http://localhost:8000/api" + url, {
+ return global.fetch(process.env.NEXT_PUBLIC_API_BASE_URL + url, {

And add to your environment configuration:
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000/api
🤖 Prompt for AI Agents
In cognee-frontend/src/utils/fetch.ts at line 25, replace the hard-coded base
URL "http://localhost:8000/api" with a reference to an environment variable such
as process.env.NEXT_PUBLIC_API_BASE_URL. This makes the API base URL
configurable for different environments. Ensure to add
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000/api to your environment
configuration file for local development.
async def query(self, query: str, params: Optional[dict] = None) -> List[Tuple]:
    """Execute a Kuzu query via the REST API."""
    try:
        # Initialize schema if needed
        if not self._schema_initialized:
            await self._initialize_schema()

        response = await self._make_request(
            "/query", {"query": query, "parameters": params or {}}
        )

        # Convert response to list of tuples
        results = []
        if "data" in response:
            for row in response["data"]:
                processed_row = []
                for val in row:
                    if isinstance(val, dict) and "properties" in val:
                        try:
                            props = json.loads(val["properties"])
                            val.update(props)
                            del val["properties"]
                        except json.JSONDecodeError:
                            pass
                    processed_row.append(val)
                results.append(tuple(processed_row))

        return results
    except Exception as e:
        logger.error(f"Query execution failed: {str(e)}")
        logger.error(f"Query: {query}")
        logger.error(f"Parameters: {params}")
        raise
🛠️ Refactor suggestion
Refactor query method to reduce nested complexity.
The query method has too many nested blocks (6/5 limit). Consider extracting helper methods to improve readability.
Extract response processing logic:
async def query(self, query: str, params: Optional[dict] = None) -> List[Tuple]:
"""Execute a Kuzu query via the REST API."""
try:
# Initialize schema if needed
if not self._schema_initialized:
await self._initialize_schema()
response = await self._make_request(
"/query", {"query": query, "parameters": params or {}}
)
- # Convert response to list of tuples
- results = []
- if "data" in response:
- for row in response["data"]:
- processed_row = []
- for val in row:
- if isinstance(val, dict) and "properties" in val:
- try:
- props = json.loads(val["properties"])
- val.update(props)
- del val["properties"]
- except json.JSONDecodeError:
- pass
- processed_row.append(val)
- results.append(tuple(processed_row))
-
- return results
+ return self._process_response(response)
except Exception as e:
logger.error(f"Query execution failed: {str(e)}")
logger.error(f"Query: {query}")
logger.error(f"Parameters: {params}")
raise
+def _process_response(self, response: dict) -> List[Tuple]:
+ """Process API response and convert to list of tuples."""
+ results = []
+ if "data" not in response:
+ return results
+
+ for row in response["data"]:
+ processed_row = []
+ for val in row:
+ processed_val = self._process_value(val)
+ processed_row.append(processed_val)
+ results.append(tuple(processed_row))
+ return results
+
+def _process_value(self, val):
+ """Process individual value from response."""
+ if isinstance(val, dict) and "properties" in val:
+ try:
+ props = json.loads(val["properties"])
+ val.update(props)
+ del val["properties"]
+ except json.JSONDecodeError:
+ pass
+ return val
🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 84-114: Too many nested blocks (6/5)
(R1702)
🤖 Prompt for AI Agents
In cognee/infrastructure/databases/graph/kuzu/remote_kuzu_adapter.py around
lines 82 to 114, the query method is overly nested, making it hard to read.
Refactor by extracting the response processing logic that converts the response
data into a list of tuples into a separate helper method. This helper should
handle iterating over rows and processing each value, including JSON decoding of
properties. Then call this helper from the query method to reduce nesting and
improve clarity.
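The flattening logic the suggestion extracts can be exercised on its own. Below is a standalone sketch of the two helpers with a hypothetical response payload; the function names mirror the suggestion but drop the adapter's `self` so the snippet runs by itself:

```python
import json

def process_value(val):
    """Flatten a JSON-encoded "properties" payload into its parent dict."""
    if isinstance(val, dict) and "properties" in val:
        try:
            props = json.loads(val["properties"])
            val.update(props)
            del val["properties"]
        except json.JSONDecodeError:
            pass  # leave malformed properties untouched
    return val

def process_response(response):
    """Convert an API-style {"data": [...]} payload into a list of tuples."""
    if "data" not in response:
        return []
    return [tuple(process_value(v) for v in row) for row in response["data"]]

# Hypothetical payload shaped like the adapter's REST response
sample = {"data": [[{"id": 1, "properties": '{"name": "node-a"}'}, 42]]}
rows = process_response(sample)
```

Note that `process_value` mutates the dict in place, exactly as the suggested helper does; callers holding a reference to the original response will see the flattened form.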
```python
data: List[UploadFile],
datasetId: Optional[UUID] = Form(default=None),
datasetName: Optional[str] = Form(default=None),
```
Fix the Form() call in argument defaults.
The static analysis tool flags Form(default=None) because the call runs at function definition time rather than at request time (Ruff B008). FastAPI evaluates parameter markers like Form() at definition time by design, so the behavior here is intentional; the shorter positional form Form(None) is equivalent and matches FastAPI's documented style.
Apply this diff to address the finding:
- datasetId: Optional[UUID] = Form(default=None),
+ datasetId: Optional[UUID] = Form(None),

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
24-24: Do not perform function call Form in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🤖 Prompt for AI Agents
In cognee/api/v1/add/routers/get_add_router.py at line 24, the argument default
value uses Form(default=None) which is evaluated at function definition time. To
fix this, remove the default argument from Form and instead use Form(None) so
that the default value is set at request time, preventing potential unexpected
behavior.
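The rationale behind Ruff's B008 is that Python evaluates a default expression once, when `def` executes, not on each call. A minimal stdlib sketch of that behavior (the function names are illustrative, not from the PR):

```python
calls = []

def make_default():
    """Record each invocation so we can observe when the default is evaluated."""
    calls.append("evaluated")
    return None

# The default expression runs exactly once, at definition time.
def handler(dataset_id=make_default()):
    return dataset_id

n_after_def = len(calls)   # already 1 before any call
handler()
handler()
n_after_calls = len(calls)  # still 1 after two calls
```

This is precisely why B008 warns on arbitrary calls in defaults; FastAPI's Form()/Query() markers are a deliberate exception, which is why projects often suppress B008 for FastAPI route signatures.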
```javascript
  username2: formElements.username2.value,
};

const websocket = new WebSocket("ws://localhost:8000/api/v1/crewai/subscribe");
```
🛠️ Refactor suggestion
Replace hardcoded localhost URL with configurable endpoint.
The hardcoded WebSocket URL will fail in production environments. Consider using environment variables or configuration for the WebSocket endpoint.
- const websocket = new WebSocket("ws://localhost:8000/api/v1/crewai/subscribe");
+ const websocket = new WebSocket(`${process.env.NEXT_PUBLIC_WS_URL || 'ws://localhost:8000'}/api/v1/crewai/subscribe`);
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/CrewAITrigger.tsx at line 31, replace the
hardcoded WebSocket URL "ws://localhost:8000/api/v1/crewai/subscribe" with a
configurable endpoint by using an environment variable or configuration setting.
This ensures the WebSocket connection works correctly in different environments,
including production. Update the code to read the WebSocket URL from process.env
or a similar configuration source.
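The same environment-override-with-local-fallback pattern applies on the backend side too. A minimal Python sketch, assuming a hypothetical `COGNEE_WS_URL` variable (the name is illustrative, not an existing Cognee setting):

```python
import os

def resolve_ws_url(path, default_base="ws://localhost:8000"):
    """Pick the WebSocket base from an environment override, defaulting to localhost for dev."""
    base = os.environ.get("COGNEE_WS_URL", default_base)
    return base.rstrip("/") + path

dev_url = resolve_ws_url("/api/v1/crewai/subscribe")

os.environ["COGNEE_WS_URL"] = "wss://example.com"
prod_url = resolve_ws_url("/api/v1/crewai/subscribe")
```

Keeping the default pointed at localhost preserves the current zero-config developer experience while letting deployments override the endpoint.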
```javascript
  alert("Please fill in both name and value fields for the new property.");
}
```
🛠️ Refactor suggestion
Replace alert() with proper error feedback.
Using alert() provides poor user experience and doesn't fit modern UI patterns. Consider using a toast notification or inline error message.
- } else {
- alert("Please fill in both name and value fields for the new property.");
- }
+ } else {
+ // Add error state to show validation message
+ setPropertyError("Please fill in both name and value fields for the new property.");
+ }

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In cognee-frontend/src/app/(graph)/GraphControls.tsx around lines 87 to 88,
replace the use of alert() for error feedback with a more user-friendly approach
such as a toast notification or an inline error message component. Remove the
alert() call and implement a state variable to track the error message, then
display this message in the UI near the input fields to provide clear and
non-intrusive feedback to the user.
```python
user,
datasets,
graph_db_config: dict = None,
vector_db_config: dict = False,
```
Fix inconsistent default parameter type (duplicate issue).
Same issue as above in the background process function.
-vector_db_config: dict = False,
+vector_db_config: dict = None,
🤖 Prompt for AI Agents
In cognee/api/v1/cognify/cognify.py at line 89, the default value for the
parameter vector_db_config is set to False, which is inconsistent with its
declared type dict. Change the default value from False to None (or an empty
dictionary {}) to match the expected type and maintain consistency with other
similar functions.
Review continued from previous batch...
```python
user,
datasets,
graph_db_config: dict = None,
vector_db_config: dict = False,
```
Fix inconsistent default parameter type.
The vector_db_config parameter has a default of False instead of None or {}, which is inconsistent with its dict type annotation.
-vector_db_config: dict = False,
+vector_db_config: dict = None,
🤖 Prompt for AI Agents
In cognee/api/v1/cognify/cognify.py at line 67, the parameter vector_db_config
is annotated as a dict but has a default value of False, which is inconsistent.
Change the default value from False to None or an empty dictionary {} to match
the dict type annotation and avoid type errors.
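Why the False default matters in practice: a common normalization guard like `if config is None` silently passes False through to code expecting a dict. A minimal sketch of the None-default idiom the suggestion implies (`run_pipeline` and the `"provider"` key are illustrative, not the actual Cognee signature):

```python
def run_pipeline(vector_db_config: dict = None):
    """Normalize the config: treat None as "use defaults" via an explicit identity check."""
    if vector_db_config is None:
        vector_db_config = {}
    return vector_db_config

ok = run_pipeline()
explicit = run_pipeline({"provider": "lancedb"})
# With a default of False, the `is None` guard above would pass False
# through to downstream code expecting a dict, which would fail on the
# first key access or .get() call.
```

Using `None` as the sentinel (rather than `{}` directly) also avoids the mutable-default pitfall, since a shared `{}` default would be reused across calls.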
Description
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.