Conversation

@borisarzentar
Member

Description

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

@borisarzentar borisarzentar requested a review from dexters1 June 30, 2025 13:09
@borisarzentar borisarzentar self-assigned this Jun 30, 2025
@pull-checklist

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@borisarzentar borisarzentar changed the base branch from main to dev June 30, 2025 13:10
@coderabbitai
Contributor

coderabbitai bot commented Jun 30, 2025

Caution

Review failed

Failed to post review comments.

Walkthrough

This update introduces extensive enhancements and refactoring across the codebase. Key changes include: a comprehensive overhaul of backend access control and dataset management, new asynchronous pipeline and event streaming mechanisms, expanded configuration and environment variable support, and substantial improvements to database adapters and authentication backends. The frontend is also refactored, with new UI components, removal of legacy wizard and ingestion modules, and a shift to Tailwind CSS for styling.

Changes

| File(s) / Group | Change Summary |
|---|---|
| .env.template, cognee-starter-kit/.env.template | Expanded and restructured environment variable templates for LLMs, embeddings, databases, and backend access control. |
| README.md, cognee-mcp/README.md, cognee-starter-kit/README.md | Formatting improvements and major documentation rewrites with usage, features, and community links. |
| Dockerfile, cognee-frontend/Dockerfile | Updated dependency options, base images, and build commands. |
| alembic/versions/ab7e313804ae_permission_system_rework.py | Alembic migration to rework ACLs from document- to dataset-level with permission seeding. |
| cognee/__init__.py | Early environment loading and logging setup. |
| cognee/api/client.py | Refactored API server: CORS restriction, custom OpenAPI, new root/health endpoints, improved error handling, router reordering. |
| cognee/api/v1/add/add.py, cognee/api/v1/add/routers/get_add_router.py | Added support for vector/graph DB config, dataset ID, and async pipeline run info streaming. |
| cognee/api/v1/cognify/*, cognee/api/v1/cognify/routers/get_cognify_router.py | Added background pipeline runs, WebSocket event streaming, and new DTO fields for datasets and run mode. |
| cognee/api/v1/datasets/routers/get_datasets_router.py | Added dataset creation endpoint, structured graph DTOs, permission-based dataset listing, and error specificity. |
| cognee/api/v1/delete/exceptions.py, cognee/api/v1/delete/routers/get_delete_router.py | Added DataNotFoundError and minor route path change. |
| cognee/api/v1/permissions/routers/get_permissions_router.py | Refactored permission/role/tenant endpoints to require authentication and support dataset-level permissions. |
| cognee/api/v1/search/* | Added support for searching by dataset UUIDs, permission enforcement, and improved error handling. |
| cognee/api/v1/settings/routers/get_settings_router.py, cognee/api/v1/users/routers/* | Minor route path and backend import adjustments. |
| cognee/api/v1/visualize/visualize.py | Updated logging setup. |
| cognee/context_global_variables.py | New context variable management for per-task DB config. |
| cognee/eval_framework/* | Improved error handling, metric calculation retries, and async pipeline iteration. |
| cognee/exceptions/exceptions.py | Added __str__ to CogneeApiError. |
| cognee/fetch_secret.py | Removed AWS secret fetching script. |
| cognee/infrastructure/databases/graph/*, cognee/infrastructure/databases/vector/* | Context-aware config retrieval, new remote Kuzu adapter, improved adapter initialization and credential handling, new methods for node/edge access, and directory/file management. |
| cognee/modules/codingagents/coding_rule_associations.py | New module for extracting and associating developer rules using LLMs and DBs. |
| cognee/modules/data/exceptions/*, cognee/modules/data/methods/*, cognee/modules/data/models/* | Added dataset-specific exceptions, dataset name checks, async dataset loading/creation, and ACL relationship changes. |
| cognee/modules/graph/methods/get_formatted_graph_data.py | New function for formatting graph data for API responses. |
| cognee/modules/ingestion/classify.py | Deferred S3 import for runtime flexibility. |
| cognee/modules/metrics/operations/get_pipeline_run_metrics.py | Refactored to process a single pipeline run at a time. |
| cognee/modules/ontology/rdf_xml/OntologyResolver.py | Replaced owlready2 with rdflib and refactored ontology traversal. |
| cognee/modules/pipelines/* | Asynchronous pipeline/event streaming, deterministic pipeline/run IDs, new pipeline run info models, and event queues. |
| cognee/modules/retrieval/*, cognee/modules/search/* | Logging, context-aware search, permission-aware search, and improved error handling. |
| cognee/modules/users/authentication/*, cognee/modules/users/exceptions/*, cognee/modules/users/get_fastapi_users.py | Refactored authentication backends, JWT strategies, and exception classes. |
| cognee-frontend/* | Major frontend refactor: new UI components, Tailwind CSS, removal of legacy wizard/ingestion modules, new authentication flow, and improved graph visualization. |
| cognee-starter-kit/* | New starter kit with pipelines, data models, and usage documentation. |
| types/d3-force-3d.d.ts | New TypeScript declarations for 3D force-directed graph layouts. |
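
The summary for cognee/context_global_variables.py mentions context variables that hold per-task DB config. The general technique can be sketched with Python's standard contextvars module. This is only an illustration under assumptions: the variable name vector_db_config, the config shape, and the file paths are hypothetical and are not taken from the PR.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

# Hypothetical name and shape; the PR's actual variables may differ.
vector_db_config: ContextVar[Optional[dict]] = ContextVar(
    "vector_db_config", default=None
)


async def run_task(dataset_id: str) -> str:
    # Every asyncio task runs in its own copy of the context, so this
    # set() call is invisible to sibling tasks and to the caller.
    vector_db_config.set({"url": f"/tmp/databases/{dataset_id}.db"})
    await asyncio.sleep(0)  # yield control so the two tasks interleave
    return vector_db_config.get()["url"]


async def main() -> list:
    # gather() wraps each coroutine in its own task.
    return await asyncio.gather(run_task("dataset-a"), run_task("dataset-b"))


results = asyncio.run(main())
print(results)
```

Each task reads back the value it set itself, and the module-level default stays untouched, which is what makes this pattern suitable for running pipelines for different datasets concurrently.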

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant DB

    User->>Frontend: Initiate dataset creation or data upload
    Frontend->>Backend: POST /v1/datasets (with user info)
    Backend->>DB: Create dataset, assign permissions
    DB-->>Backend: Dataset created, permissions set
    Backend-->>Frontend: Dataset info

    User->>Frontend: Start pipeline (add/cognify)
    Frontend->>Backend: POST /v1/add or /v1/cognify (with dataset ID)
    Backend->>DB: Start pipeline, set context variables
    Backend-->>Frontend: Pipeline run info (event stream via WebSocket)
    Backend->>DB: Update pipeline run status/events

    User->>Frontend: Search/query data
    Frontend->>Backend: POST /v1/search (with dataset IDs)
    Backend->>DB: Permission check, context setup
    Backend->>DB: Run search/query
    DB-->>Backend: Search results
    Backend-->>Frontend: Results
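
The request sequence in the diagram above can be sketched from the client side as follows. The endpoint paths come from the diagram; everything else is an assumption made for illustration: the base URL, the bearer-token header, and all payload field names (name, dataset_ids, query) are hypothetical, and the real API may expect different shapes (for example, file uploads for /v1/add). The sketch only builds the requests and does not send them.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/api"  # hypothetical base URL


def post_json(path: str, payload: dict, token: str) -> Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def build_flow(token: str, dataset_id: str) -> list:
    # In a real client the dataset ID would come from the response to the
    # first request; here it is passed in as a placeholder.
    return [
        post_json("/v1/datasets", {"name": "example-dataset"}, token),
        post_json("/v1/cognify", {"dataset_ids": [dataset_id]}, token),
        post_json(
            "/v1/search",
            {"query": "example query", "dataset_ids": [dataset_id]},
            token,
        ),
    ]


requests_to_send = build_flow(token="<access-token>", dataset_id="<dataset-uuid>")
for request in requests_to_send:
    print(request.get_method(), request.full_url)
```

Each Request object could be passed to urllib.request.urlopen against a running server; the WebSocket event stream shown in the diagram is not covered by this sketch.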

Possibly related PRs

  • topoteretes/cognee#742: Also modifies the .env.template file, but only adds default user credentials; shares configuration context.
  • topoteretes/cognee#903: Both PRs update the Neo4jAdapter for consistent node labeling and uniqueness constraints.
  • topoteretes/cognee#953: Both PRs introduce and document the ENABLE_BACKEND_ACCESS_CONTROL variable in .env.template.

Suggested labels

run-checks

Poem

In fields of code where data flows,
A rabbit hops where backend grows.
Context set, permissions tight,
Pipelines run through day and night.
Graphs and nodes, a vibrant scene—
Refactored fresh, and running clean!
🐇✨


@gitguardian

gitguardian bot commented Jun 30, 2025

⚠️ GitGuardian has uncovered 4 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 17350889 | Triggered | Generic Password | 4eb71cc | cognee/tests/test_remote_kuzu.py |
| 16755449 | Triggered | Generic Password | 82e3537 | .env.template |
| 7122536 | Triggered | Generic High Entropy Secret | 97d05f1 | .github/workflows/search_db_tests.yml |
| 17116131 | Triggered | Generic Password | 3b07f3c | examples/database_examples/neo4j_example.py |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
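
As one way to follow step 2 in test files and examples, a password can be read from the environment at runtime instead of being written into the source. This is a minimal sketch, not the project's actual code: the helper name get_required_secret and the variable name EXAMPLE_DB_PASSWORD are hypothetical.

```python
import os


def get_required_secret(name: str) -> str:
    """Return the value of an environment variable, failing loudly if unset."""
    value = os.environ.get(name)
    if not value:
        # Fail early with the variable name only; never log the value itself.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Usage (hypothetical variable name), e.g. in a test or example script:
#   password = get_required_secret("EXAMPLE_DB_PASSWORD")
```

The variable itself would be supplied by the local shell, an untracked .env file, or the CI system's secret store, so no credential value appears in a commit.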


@borisarzentar borisarzentar changed the title from "fix/: authorize in swagger" to "fix: authorize in swagger" Jun 30, 2025
@borisarzentar borisarzentar merged commit da14497 into dev Jun 30, 2025
59 of 65 checks passed
@borisarzentar borisarzentar deleted the fix/aithorize-in-swagger branch June 30, 2025 13:56