
Conversation

@alekszievr
Contributor

@alekszievr alekszievr commented Mar 11, 2025

Description

Scripts for evaluating the dev branch of Cognee and sending the results to Segment.
The scripts will be called from the analytics repo:
https://github.com/topoteretes/cognee-analytics/blob/main/.github/workflows/regular_cognee_eval_to_segment.yml

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin

Summary by CodeRabbit

  • New Features
    • Launched an asynchronous evaluation pipeline that streamlines processing and generates output reports.
    • Integrated automated dispatch of evaluation metrics to an analytics platform for enhanced performance monitoring.
  • Refactor
    • Updated how configuration values are retrieved from the environment to ensure more reliable runtime behavior.
  • Chores
    • Added a new optional dependency for Segment analytics to enhance data tracking capabilities.

@gitguardian

gitguardian bot commented Mar 11, 2025

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id 9573981 (status: Triggered) flagged a Generic Password in commit b17393c, file helm/values.yaml.
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn the best practices here.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not trivial: you might break other contributors' workflows and risk accidentally deleting legitimate data.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2025

Walkthrough

The changes modify environment variable access in the modal image configuration by replacing string interpolation with direct calls to os.getenv(). Additionally, two new evaluation modules are introduced: one defines an asynchronous evaluation pipeline via the modal_run_eval function in modal_run_regular_eval.py, and the other adds functionality to send evaluation results to Segment using PostHog analytics in send_results_to_segment.py. These updates improve runtime variable retrieval and introduce asynchronous evaluation and results reporting.

Changes

File(s) and summary:
  • cognee/eval_framework/modal_run_eval.py: Replaced string interpolation (e.g., "$ENV") with os.getenv() calls to fetch environment variables dynamically (see the sketch below).
  • cognee/tests/evaluation/modal_run_regular_eval.py and cognee/tests/evaluation/send_results_to_segment.py: Introduced a new asynchronous evaluation pipeline in modal_run_regular_eval.py (with modal_run_eval and main functions) and a new script in send_results_to_segment.py to send evaluation results to Segment using PostHog analytics, along with error handling and logging.
  • pyproject.toml: Added the new optional dependency segment-analytics-python version 2.3.3 and updated the analytics extra to include it.
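
For illustration, a minimal sketch of the os.getenv() change summarized in the first row; the variable names are assumptions, not necessarily the keys used in modal_run_eval.py:

import os

# Before: the image definition interpolated shell-style strings such as "$LLM_API_KEY".
# After: values are read at runtime with os.getenv(), so the container picks up whatever
# is currently set in the environment. The variable names below are illustrative only.
modal_image_env = {
    "LLM_API_KEY": os.getenv("LLM_API_KEY"),
    "GRAPH_DATABASE_URL": os.getenv("GRAPH_DATABASE_URL", ""),  # optional default when unset
}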

Sequence Diagram(s)

sequenceDiagram
  participant Main
  participant EvalFunc as modal_run_eval()
  participant Corpus as Corpus Builder
  participant QA as QA Module
  participant Evaluator as Evaluation Module
  participant FS as File System

  Main->>EvalFunc: Call modal_run_eval(eval_params)
  EvalFunc->>Corpus: Execute corpus building
  Corpus-->>EvalFunc: Return corpus data
  EvalFunc->>QA: Execute question-answering module
  QA-->>EvalFunc: Return QA results
  EvalFunc->>Evaluator: Run evaluation module
  Evaluator-->>EvalFunc: Return evaluation metrics
  EvalFunc->>FS: Write results to metrics_output.json
  FS-->>Main: Confirm file save
sequenceDiagram
  participant Main
  participant Sender as send_event_to_segment()
  participant Analytics as Segment (PostHog)

  Main->>Sender: Provide results data
  Sender->>Analytics: Send "cognee_eval_results" event
  Analytics-->>Sender: Acknowledge event reception
  Sender->>Analytics: Flush pending events
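
A minimal sketch of the flow in the second diagram, assuming the segment-analytics-python package imported as analytics (as in the PR's script), a module-level write_key attribute, and an illustrative payload; the identifier and property names are assumptions:

import os
import analytics  # provided by the segment-analytics-python package

analytics.write_key = os.getenv("SEGMENT_WRITE_KEY")

def send_event_to_segment(results: dict) -> None:
    # Flatten aggregate metrics into event properties (illustrative payload shape).
    properties = {
        f"mean_{key}": value["mean"]
        for key, value in results.get("aggregate_metrics", {}).items()
    }
    analytics.track(
        user_id="cognee_regular_eval",  # assumed identifier, not taken from the PR
        event="cognee_eval_results",
        properties=properties,
    )
    analytics.flush()  # deliver any queued events before the process exits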

Suggested reviewers

  • lxobr
  • hajdul88

Poem

I'm hopping through the code with glee,
Each new change as sweet as can be!
Variables now fetched with rabbit care,
Pipelines and events leaping through the air.
My whiskers twitch with every clever bit—
This hop of progress is an absolute hit! 🐰

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true in the settings.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cf30a7 and 6a8380c.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • cognee/tests/evaluation/send_results_to_segment.py (1 hunks)
  • pyproject.toml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cognee/tests/evaluation/send_results_to_segment.py
⏰ Context from checks skipped due to timeout of 90000ms (24)
  • GitHub Check: run_simple_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-15
  • GitHub Check: Test on macos-15
  • GitHub Check: test
  • GitHub Check: Test on macos-13
  • GitHub Check: Test on macos-13
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_multimedia_example_test / test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: run_eval_framework_test / test
  • GitHub Check: run_dynamic_steps_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-15
  • GitHub Check: Test on macos-13
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: test
  • GitHub Check: windows-latest
  • GitHub Check: docker-compose-test
  • GitHub Check: run_simple_example_test
🔇 Additional comments (2)
pyproject.toml (2)

93-93: New Optional Dependency Added for Segment Analytics

The dependency segment-analytics-python is now added as an optional dependency with version "2.3.3". This aligns with the PR’s objective of sending evaluation results to Segment. Please verify that the chosen version is compatible with the rest of the analytics workflow and up-to-date with current Segment recommendations.


120-120: Analytics Extra Updated

The analytics extra now includes segment-analytics-python, providing users a convenient option to install the dependencies needed for Segment integration. Ensure that any related documentation or integration tests are updated accordingly to reflect this new extra.


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@coderabbitai coderabbitai bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 9

🔭 Outside diff range comments (2)
cognee-mcp/README.md (1)

4-7: ⚠️ Potential issue

Remove Unresolved Merge Conflict Markers

It appears that unresolved merge conflict markers (e.g., the "======") are present between lines 4 and 7. These artifacts should be removed to ensure the documentation is clean and does not confuse readers.

helm/Dockerfile (1)

1-60: 💡 Verification agent

🧩 Analysis chain

Dockerfile needs optimization for build efficiency and image size

The Dockerfile is missing some best practices that could improve build efficiency and reduce image size.

Consider these improvements:

 FROM python:3.11-slim

 # Define Poetry extras to install
 ARG POETRY_EXTRAS="\
 # Storage & Databases \
 filesystem postgres weaviate qdrant neo4j falkordb milvus \
 # Notebooks & Interactive Environments \
 notebook \
 # LLM & AI Frameworks \
 langchain llama-index gemini huggingface ollama mistral groq \
 # Evaluation & Monitoring \
 deepeval evals posthog \
 # Graph Processing & Code Analysis \
 codegraph graphiti \
 # Document Processing \
 docs"

 # Set build argument
 ARG DEBUG

 # Set environment variable based on the build argument
 ENV DEBUG=${DEBUG}
 ENV PIP_NO_CACHE_DIR=true
 ENV PATH="${PATH}:/root/.poetry/bin"


-RUN apt-get install -y \
+RUN apt-get update && apt-get install -y \
   gcc \
-  libpq-dev
+  libpq-dev \
+  && apt-get clean \
+  && rm -rf /var/lib/apt/lists/*


 WORKDIR /app
 COPY pyproject.toml poetry.lock /app/


 RUN pip install poetry

 # Don't create virtualenv since docker is already isolated
 RUN poetry config virtualenvs.create false

 # Install the dependencies
 RUN poetry install --extras "${POETRY_EXTRAS}" --no-root --without dev


 # Set the PYTHONPATH environment variable to include the /app directory
 ENV PYTHONPATH=/app

 COPY cognee/ /app/cognee

 # Copy Alembic configuration
 COPY alembic.ini /app/alembic.ini
 COPY alembic/ /app/alembic

 COPY entrypoint.sh /app/entrypoint.sh
 RUN chmod +x /app/entrypoint.sh

 RUN sed -i 's/\r$//' /app/entrypoint.sh

 ENTRYPOINT ["/app/entrypoint.sh"]

These changes:

  1. Add apt-get update before installing packages
  2. Clean up apt cache after installation to reduce image size
  3. Use a single RUN command for apt operations to reduce layers

🏁 Script executed:

#!/bin/bash
# Check if there are any other Dockerfiles in the project for reference on best practices
fd -t f "Dockerfile" --exclude "helm/Dockerfile"

Length of output: 123


Optimize apt-get Commands in helm/Dockerfile for Efficiency and Reduced Image Size

The helm/Dockerfile can be improved by:

  • Running apt-get update before installing packages so that the package lists are current.
  • Combining the update, install, and cleanup commands into a single RUN command to reduce layers.
  • Cleaning up the apt cache (using apt-get clean and removing /var/lib/apt/lists/*) after installation to keep the final image lean.

Here's the recommended change:

 FROM python:3.11-slim

 # Define Poetry extras to install
 ARG POETRY_EXTRAS="\
 # Storage & Databases \
 filesystem postgres weaviate qdrant neo4j falkordb milvus \
 # Notebooks & Interactive Environments \
 notebook \
 # LLM & AI Frameworks \
 langchain llama-index gemini huggingface ollama mistral groq \
 # Evaluation & Monitoring \
 deepeval evals posthog \
 # Graph Processing & Code Analysis \
 codegraph graphiti \
 # Document Processing \
 docs"

 # Set build argument
 ARG DEBUG

 # Set environment variable based on the build argument
 ENV DEBUG=${DEBUG}
 ENV PIP_NO_CACHE_DIR=true
 ENV PATH="${PATH}:/root/.poetry/bin"

- RUN apt-get install -y \
-   gcc \
-   libpq-dev
+ RUN apt-get update && apt-get install -y \
+   gcc \
+   libpq-dev \
+  && apt-get clean \
+  && rm -rf /var/lib/apt/lists/*

 WORKDIR /app
 COPY pyproject.toml poetry.lock /app/

 RUN pip install poetry

 # Don't create virtualenv since docker is already isolated
 RUN poetry config virtualenvs.create false

 # Install the dependencies
 RUN poetry install --extras "${POETRY_EXTRAS}" --no-root --without dev

 # Set the PYTHONPATH environment variable to include the /app directory
 ENV PYTHONPATH=/app

 COPY cognee/ /app/cognee

 # Copy Alembic configuration
 COPY alembic.ini /app/alembic.ini
 COPY alembic/ /app/alembic

 COPY entrypoint.sh /app/entrypoint.sh
 RUN chmod +x /app/entrypoint.sh

 RUN sed -i 's/\r$//' /app/entrypoint.sh

 ENTRYPOINT ["/app/entrypoint.sh"]

Implementing these changes aligns the helm/Dockerfile with best practices observed in the other Dockerfiles in the project, enhancing both build efficiency and image size.

🧹 Nitpick comments (30)
README.md (2)

47-47: Missing comma in list.

Consider adding a comma before "and" in the list of interconnected items for proper grammar.

-Interconnect and retrieve your past conversations, documents, images and audio transcriptions
+Interconnect and retrieve your past conversations, documents, images, and audio transcriptions
🧰 Tools
🪛 LanguageTool

[uncategorized] ~47-~47: Possible missing preposition found.
Context: ...ents, images and audio transcriptions - Reduce hallucinations, developer effort, and c...

(AI_HYDRA_LEO_MISSING_TO)


159-159: Missing question mark.

If "What is AI memory" is a question, it should end with a question mark.

-What is AI memory:
+What is AI memory?
🧰 Tools
🪛 LanguageTool

[typographical] ~159-~159: If the word ‘What’ starts a question, add a question mark at the end of the sentence.
Context: .../> ## Demos What is AI memory: [Learn about cognee](https://github.co...

(WP_VB_QUESTION_MARK)

CONTRIBUTING.md (1)

38-38: Bare email address could lead to formatting issues.

Consider wrapping the email address in angle brackets or markdown code formatting to ensure consistent rendering across different markdown processors.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

38-38: Bare URL used
null

(MD034, no-bare-urls)

helm/README.md (1)

1-25: Good documentation for Helm chart deployment

The README provides clear instructions for installing and configuring the Helm chart for Cognee on Kubernetes.

There's a minor markdown style issue - remove the trailing colon in the "Deploy Helm Chart:" heading:

-## Deploy Helm Chart:
+## Deploy Helm Chart
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

16-16: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

helm/values.yaml (1)

1-22: Good configuration structure, but consider security best practices

The Helm values file is well structured and provides comprehensive configuration for both the 'cognee' application and 'postgres' database services. However, I have a few suggestions:

  1. Consider using external secrets management instead of hardcoding database credentials in values.yaml
  2. The image reference "hajdul1988/cognee-backend:latest" should be versioned instead of using the "latest" tag for better stability and reproducibility
  3. The resource allocations (4 CPU cores, 8GB memory) are quite substantial and might be optimized based on actual usage patterns

For better security and maintainability, consider implementing these changes:

# Configuration for the 'postgres' database service
postgres:
  image: "pgvector/pgvector:pg17"
  port: 5432
  env:
-    POSTGRES_USER: "cognee"
-    POSTGRES_PASSWORD: "cognee"
+    # Reference secrets from a secure store
+    POSTGRES_USER: "{{ .Values.postgresql.auth.username }}"
+    POSTGRES_PASSWORD: "{{ .Values.postgresql.auth.password }}"
    POSTGRES_DB: "cognee_db"
  storage: "8Gi"
cognee/tasks/entity_completion/entity_extractors/regex_entity_config.json (2)

1-62: Comprehensive entity regex definitions, but some patterns could be improved.

The entity configuration is well-structured with consistent formatting. However, some regex patterns could be enhanced:

  1. The URL pattern (line 17) doesn't account for all valid TLDs and might miss URLs with paths containing special characters.
  2. The PERSON pattern (line 41) is quite restrictive and may miss names with apostrophes or hyphens.
  3. The DATE pattern (line 23) only handles specific formats (YYYY-MM-DD or DD-MM-YYYY) and misses many common date formats.

Consider improving these regex patterns for better coverage:

-        "regex": "https?:\\/\\/(www\\.)?[a-zA-Z0-9-]+(\\.[a-zA-Z]{2,})+(\\/\\S*)?",
+        "regex": "https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b([-a-zA-Z0-9()@:%_\\+.~#?&//=]*)",

-        "regex": "(\\d{4}[-/]\\d{2}[-/]\\d{2})|(\\d{2}[-/]\\d{2}[-/]\\d{4})",
+        "regex": "(\\d{4}[-/\\.](0?[1-9]|1[0-2])[-/\\.](0?[1-9]|[12]\\d|3[01]))|((?:0?[1-9]|[12]\\d|3[01])[-/\\.](?:0?[1-9]|1[0-2])[-/\\.]\\d{4})",

-        "regex": "\\b(?:(?:Dr|Prof|Mr|Mrs|Ms)\\.?\\s+)?[A-Z][a-z]+(?:\\s+(?:[A-Z][a-z]+|[A-Z]\\.?|(?:van|de|la|del|von|der|le)))+\\b",
+        "regex": "\\b(?:(?:Dr|Prof|Mr|Mrs|Ms)\\.?\\s+)?[A-Z][a-z]+(?:[-'\\s][A-Z][a-z]+)*(?:\\s+(?:[A-Z][a-z]+(?:[-'\\s][A-Z][a-z]+)*|[A-Z]\\.?|(?:van|de|la|del|von|der|le)))*\\b",

11-12: Phone regex pattern might miss valid international formats.

The current phone pattern doesn't fully account for the variety of international phone number formats, which could lead to missed valid phone numbers.

Consider using a more comprehensive regex for international phone numbers:

-        "regex": "\\+?\\d{1,4}[\\s-]?\\(?\\d{2,4}\\)?[\\s-]?\\d{3,4}[\\s-]?\\d{3,4}",
+        "regex": "(?:\\+|00)[1-9]\\d{0,3}[\\s-]?(?:\\(\\d{1,4}\\)|\\d{1,4})[\\s-]?\\d{1,4}[\\s-]?\\d{1,4}|\\(?\\d{3}\\)?[-\\s]?\\d{3}[-\\s]?\\d{4}",
helm/templates/postgres_service.yaml (1)

14-15: Remove excessive blank lines at end of file

There are extra blank lines at the end of the file that should be removed to comply with YAML best practices.

    app: {{ .Release.Name }}-postgres
-

-
🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 14-14: too many blank lines

(1 > 0) (empty-lines)

cognee/eval_framework/evaluation/deep_eval_adapter.py (1)

36-37: Consider adding error handling for retrieval_context

While the code handles the case where golden_context might be missing, it assumes retrieval_context is always present. Consider adding similar error handling for retrieval_context to prevent potential runtime errors.

-  retrieval_context=[answer["retrieval_context"]],
+  retrieval_context=[answer["retrieval_context"]] if "retrieval_context" in answer else [],
cognee/eval_framework/evaluation/evaluation_executor.py (1)

24-27: Consider creating a new metrics list instead of modifying the input

The current implementation modifies the evaluator_metrics list that was passed in by the caller. While this works, it could lead to unexpected behavior if the caller doesn't expect their list to be modified.

Consider creating a new list instead:

- if self.evaluate_contexts:
-     evaluator_metrics.append("contextual_relevancy")
-     evaluator_metrics.append("context_coverage")
+ metrics_to_evaluate = list(evaluator_metrics)  # Create a copy
+ if self.evaluate_contexts:
+     metrics_to_evaluate.append("contextual_relevancy")
+     metrics_to_evaluate.append("context_coverage")
+ metrics = await self.eval_adapter.evaluate_answers(answers, metrics_to_evaluate)
helm/docker-compose-helm.yml (2)

19-19: Consider documenting the debugging port usage

The commented-out debugging port suggests that there's a debugging capability that's disabled by default. Consider adding a comment explaining how to enable and use this debugging functionality.


26-38: Consider using environment variables for database credentials

The PostgreSQL credentials are hardcoded in the file. It would be more secure and flexible to use environment variables instead.

  postgres:
    image: pgvector/pgvector:pg17
    container_name: postgres
    environment:
-     POSTGRES_USER: cognee
-     POSTGRES_PASSWORD: cognee
-     POSTGRES_DB: cognee_db
+     POSTGRES_USER: ${POSTGRES_USER:-cognee}
+     POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-cognee}
+     POSTGRES_DB: ${POSTGRES_DB:-cognee_db}

This change allows overriding the credentials through environment variables while maintaining the defaults.

cognee-mcp/Dockerfile (5)

8-8: Consider verifying the impact of disabling bytecode compilation.
If performance is critical, reintroducing or re-validating ENV UV_COMPILE_BYTECODE=1 may help.


16-16: Consider adding a default for DEBUG environment variable.
If $DEBUG is not set at build time, ENV DEBUG=${DEBUG} might result in an empty variable.


25-25: Potential additional dependency files.
If using Poetry or storing dependencies elsewhere, consider also copying those lockfiles.


31-31: Security consideration for copying .env files.
Be cautious about shipping secrets in container images.


44-44: Copying the local user environment directories in their entirety.
This may pull in more files than desired. Consider copying only necessary subdirectories if image size is a concern.

cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)

86-86: Potential naming collision in label assignment.
Using type(node).__name__ is convenient, but changes to class names or duplicates across classes may cause confusion in Neo4j.

examples/low_level/product_recommendation.py (1)

62-109: Robust JSON parsing for ingesting data.
Consider adding error handling for file-not-found or invalid JSON.
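
A small sketch of the suggested error handling, assuming the example reads a local JSON file; the path and return shape are illustrative:

import json
import logging

logger = logging.getLogger(__name__)

def load_products(path: str) -> list[dict]:
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        logger.error("Product data file not found: %s", path)
        return []
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON in %s: %s", path, exc)
        return []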

helm/templates/postgres_deployment.yaml (1)

8-8: Consider scaling and resource management.

Currently, replicas: 1 will not offer high availability. Adding a readiness probe, liveness probe, and resource requests/limits can help ensure stable performance under load.

cognee/eval_framework/answer_generation/answer_generation_executor.py (1)

11-14: Ensure instances instead of classes are stored in the dictionary.

You are assigning classes (e.g., GraphCompletionRetriever) to retriever_options rather than class instances. Ensure you instantiate the classes before passing them to the question_answering_non_parallel method, or adjust how the dictionary is used if that is the intended design.

-retriever_options: Dict[str, BaseRetriever] = {
-    "cognee_graph_completion": GraphCompletionRetriever,
-    "cognee_completion": CompletionRetriever,
-    "graph_summary_completion": GraphSummaryCompletionRetriever,
-}
+retriever_options: Dict[str, BaseRetriever] = {
+    "cognee_graph_completion": GraphCompletionRetriever(),
+    "cognee_completion": CompletionRetriever(),
+    "graph_summary_completion": GraphSummaryCompletionRetriever(),
+}
cognee/tasks/entity_completion/entity_extractors/regex_entity_extractor.py (3)

15-25: Confirm concurrency implications of reading config at initialization.

Loading the configuration file and logging the entity types is typically fine for single-instance usage. If multiple instances of this extractor are hot-loaded in parallel, ensure thread safety or load the config outside the constructor in a shared manner if needed.


27-34: Use named arguments for clarity in Entity constructor.

While positional arguments work, explicitly naming the fields (e.g. name=match_text, is_a=entity_type_obj) can improve readability and reduce errors if the Entity class definition changes.


61-73: Ensure large input text is handled efficiently.

Regex matching on very large strings can be time-consuming. If the input can be lengthy, consider timeouts, chunking the text, or other techniques to manage performance and resource usage.
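
One possible chunking approach, sketched under the assumption that no match spans more than the chosen overlap across chunk boundaries:

import re
from typing import Iterator

def finditer_chunked(pattern: re.Pattern, text: str, chunk_size: int = 100_000, overlap: int = 200) -> Iterator[re.Match]:
    """Run pattern.finditer over overlapping windows so huge inputs are processed incrementally."""
    seen_spans = set()
    start = 0
    while start < len(text):
        window = text[start : start + chunk_size + overlap]
        for match in pattern.finditer(window):
            span = (start + match.start(), start + match.end())
            if span not in seen_spans:  # skip duplicates found in the overlap region
                seen_spans.add(span)
                yield match
        start += chunk_size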

notebooks/pokemon_datapoints_notebook.ipynb (2)

60-62: Consider making fetch limit configurable.
The Pokémon fetch limit is hard-coded to 50. Exposing it as a notebook parameter or environment variable would improve flexibility.


364-369: Perform selective data pruning.
Pruning all data and system metadata each time can be expensive. Consider implementing a more targeted pruning approach if you only need partial cleanup.

examples/python/pokemon_datapoints_example.py (1)

76-79: Refine HTTP error handling logic.
You are calling response.raise_for_status(), which is good, but consider customizing error messages or implementing retries for transient failures.
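
A hedged sketch of adding retries for transient failures with requests; the endpoint, retry budget, and timeout are illustrative:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry idempotent requests on common transient server errors with exponential backoff.
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=(500, 502, 503, 504))
session.mount("https://", HTTPAdapter(max_retries=retries))

response = session.get("https://pokeapi.co/api/v2/pokemon", params={"limit": 50}, timeout=10)
response.raise_for_status()  # still fail loudly on non-retryable client errors
data = response.json()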

cognee/tasks/entity_completion/entity_extractors/regex_entity_config.py (2)

32-39: Guard against catastrophic backtracking in user-defined patterns.
Regex patterns can lead to performance issues if they allow catastrophic backtracking. Consider using a safer regex engine, timeouts, or patterns known to be linear.


52-66: Prevent duplicated entity names.
As the configs are stored in a dictionary keyed by entity name, adding a check for duplicate entity_name entries would prevent accidental overwrites.

cognee/tests/unit/entity_extraction/regex_entity_extraction_test.py (1)

184-201: Clarify the comment about invalid IP addresses.

There's a mismatch between the comment "None of these should be extracted as valid IPs" and the assertions. The test expects one valid IP to be extracted (1.102.103.104) but the comment suggests that none should be extracted.

-    # None of these should be extracted as valid IPs
+    # Most of these invalid IPs should not be extracted, but 1.102.103.104 is valid
     assert len(ip_entities) == 1
     assert "999.999.999.999" not in [e.name for e in ip_entities]
     assert "256.256.256.256" not in [e.name for e in ip_entities]
     assert "1.2.3.4.5" not in [e.name for e in ip_entities]
     assert "01.102.103.104" not in [e.name for e in ip_entities]
     assert "1.102.103.104" in [e.name for e in ip_entities]
🛑 Comments failed to post (9)
cognee/modules/retrieval/completion_retriever.py (1)

26-29: ⚠️ Potential issue

Issue: Only the first retrieved chunk is used despite retrieving top_k chunks

While the code now retrieves top_k chunks from the vector engine, it still only returns the first chunk at index 0 in line 29. This creates a disconnect between the retrieval capability and the actual usage.

Consider modifying the return statement to utilize all retrieved chunks:

- return found_chunks[0].payload["text"]
+ # If only one chunk is requested, return it directly
+ if self.top_k == 1:
+     return found_chunks[0].payload["text"]
+ # Otherwise, return all chunks' text
+ return [chunk.payload["text"] for chunk in found_chunks]

This change would require updating the get_completion method and other dependent code to handle the new return type.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        found_chunks = await vector_engine.search("DocumentChunk_text", query, limit=self.top_k)
        if len(found_chunks) == 0:
            raise NoRelevantDataFound
        # If only one chunk is requested, return it directly
        if self.top_k == 1:
            return found_chunks[0].payload["text"]
        # Otherwise, return all chunks' text
        return [chunk.payload["text"] for chunk in found_chunks]
cognee/eval_framework/metrics_dashboard.py (1)

68-71: 🛠️ Refactor suggestion

Guard against missing data.

When retrieving data with {key: entry[key] for key in current_metrics_fields}, consider the risk of KeyError if certain keys are absent in entry. Ensure that upstream code or data loading logic consistently populates these fields, or use a safe retrieval strategy (e.g., entry.get(key, "")) to prevent runtime errors.

cognee/eval_framework/evaluation/metrics/context_coverage.py (1)

8-33: 🛠️ Refactor suggestion

Handle potential empty contexts in measure method.

Using test_case.context[0] and test_case.retrieval_context[0] assumes these lists are non-empty. If either is unexpectedly empty, it may cause an IndexError. Add a safety check or gracefully handle empty or malformed test cases to avoid runtime crashes.
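
A minimal sketch of such a safety check; test_case and its attributes are assumed to be the deepeval test case referenced in the comment:

def first_or_empty(items) -> str:
    """Return the first element of a possibly empty list, or an empty string."""
    return items[0] if items else ""

# Inside measure(), instead of indexing directly:
#     context = first_or_empty(test_case.context or [])
#     retrieval_context = first_or_empty(test_case.retrieval_context or [])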

cognee-mcp/Dockerfile (2)

17-17: ⚠️ Potential issue

Potential duplication of ENV DEBUG.
Having two consecutive ENV DEBUG=${DEBUG} lines may be redundant or confusing. Consider removing the extra line.

-ENV DEBUG=${DEBUG}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.



32-32: 💡 Verification agent

🧩 Analysis chain

Verify that sensitive data is not inadvertently committed.
Copying .env* can lead to potential credential exposure in images.


🏁 Script executed:

#!/bin/bash
# Search for .env files with potential secrets
rg --json "SECRET|PASSWORD|TOKEN" .env*

Length of output: 1860


Security Risk: Reassess the Inclusion of Environment Files in the Image

  • The glob pattern COPY .env* /app/ currently copies all files matching .env*, which includes files like .env.template.
  • Our verification found that .env.template contains lines with keywords such as PASSWORD and TOKEN, indicating that sensitive or placeholder credential fields might be present.
  • Action Required:
    • Double-check that none of the copied files (including templates) contain actual secrets or sensitive data intended only for local/development use.
    • If .env.template is meant solely for demonstration, consider renaming it to a non-sensitive name (e.g., .env.example) or explicitly excluding it from the image build.
    • Alternatively, use a .dockerignore file or a more specific COPY directive to ensure only the intended, non-sensitive environment files are included.
helm/templates/postgres_deployment.yaml (1)

25-26: 🛠️ Refactor suggestion

Avoid storing sensitive credentials in plain text.

Storing POSTGRES_PASSWORD directly in the Deployment manifest is a potential security risk, as it may be exposed in version control or logs. Consider referencing a Kubernetes secret instead.

Below is an example snippet for using a Secret reference:

- name: POSTGRES_PASSWORD
- value: {{ .Values.postgres.env.POSTGRES_PASSWORD }}
+ name: POSTGRES_PASSWORD
+ valueFrom:
+   secretKeyRef:
+     name: {{ .Release.Name }}-postgres-secret
+     key: POSTGRES_PASSWORD
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ .Release.Name }}-postgres-secret
                  key: POSTGRES_PASSWORD
cognee/eval_framework/answer_generation/answer_generation_executor.py (1)

29-31: ⚠️ Potential issue

Handle empty retrieval results to avoid IndexError.

search_results[0] will raise an IndexError if search_results is empty. Add a check or fallback for cases where nothing is returned.

search_results = await retriever.get_completion(query_text, retrieval_context)
-if search_results:
-    answer = {
-        "answer": search_results[0],
-        ...
-    }
-else:
-    # Fallback logic here

Committable suggestion skipped: line range outside the PR's diff.

notebooks/pokemon_datapoints_notebook.ipynb (1)

466-500: 💡 Verification agent

🧩 Analysis chain

Ensure robust error handling for search queries.
While the code calls cognee.search, it lacks a try-except flow to gracefully handle failed requests or connectivity issues. Confirm the error handling meets your reliability needs.


🏁 Script executed:

#!/bin/bash
# Verify if any explicit try-except blocks protect the `cognee.search` call.
rg 'cognee\.search' -A 5

Length of output: 23260


Add Robust Error Handling for cognee.search Calls

It appears that in notebooks/pokemon_datapoints_notebook.ipynb (lines 466-500) the asynchronous call to cognee.search is not wrapped in a try-except block. This leaves the code vulnerable to unhandled exceptions in cases of API failures or connectivity issues. To improve the robustness of the notebook, please consider adding try-except logic around the await cognee.search(...) call so that errors can be caught and managed appropriately.
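
A hedged sketch of wrapping the call; the argument name passed to cognee.search is an assumption and should be matched to the notebook's actual call:

import logging
import cognee

logger = logging.getLogger(__name__)

async def safe_search(query: str):
    try:
        # Argument names are illustrative; align them with the call used in the notebook.
        return await cognee.search(query_text=query)
    except Exception as exc:  # narrow to the specific client exceptions if known
        logger.error("cognee.search failed for %r: %s", query, exc)
        return []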

examples/python/pokemon_datapoints_example.py (1)

171-198: 🛠️ Refactor suggestion

Refine concurrency or partial failure handling in Cognee tasks.
While the async flow is appropriate, consider how partial failures (e.g., if some data points can’t be added) are handled, and whether data consistency is preserved.
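
One way to surface partial failures while keeping the async flow, sketched with asyncio.gather and an assumed add_data_point coroutine:

import asyncio
import logging

logger = logging.getLogger(__name__)

async def add_all(data_points, add_data_point):
    """add_data_point is an assumed coroutine that ingests one data point into Cognee."""
    results = await asyncio.gather(
        *(add_data_point(dp) for dp in data_points),
        return_exceptions=True,  # keep going even if some additions fail
    )
    failures = [(dp, r) for dp, r in zip(data_points, results) if isinstance(r, Exception)]
    for dp, exc in failures:
        logger.error("Failed to add data point %r: %s", dp, exc)
    return failures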

@lxobr
Collaborator

lxobr commented Mar 13, 2025

This looks good to me! I only really understand the part about Modal and it seems correct. I would recommend that someone more familiar with posthog and github actions approves this.

f"mean_{key}": results["aggregate_metrics"][key]["mean"]
for key in results["aggregate_metrics"].keys()
}
posthog.capture(
Member

I think we should send these events to segment instead of directly to posthog?
We already have send_telemetry in cognee repo, so maybe we can reuse that one.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
cognee/tests/evaluation/modal_run_regular_eval.py (1)

58-60: Check for None results before writing JSON.

When the pipeline skips metrics, your script may dump None to the JSON file. While valid JSON, it can be confusing or break the next stage. A simple conditional check before dumping can prevent misinterpretation of the output.
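
A minimal sketch of the suggested guard, using the metrics_output.json name from the walkthrough diagram:

import json
import logging

logger = logging.getLogger(__name__)

def write_metrics(results, path: str = "metrics_output.json") -> None:
    if results is None:
        logger.warning("No metrics produced; skipping write to %s", path)
        return
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)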

cognee/tests/evaluation/send_results_to_segment.py (3)

1-1: Remove or replace the unused Posthog import.

You import Posthog from posthog, but the code relies on analytics instead. This unused import can lead to confusion about which analytics library is in use.

-from posthog import Posthog

30-38: Add error handling for analytics.track.

At present, if the call to analytics.track or subsequent flush fails, it may go unreported. Consider wrapping these calls in a try-except block or logging potential exceptions to facilitate troubleshooting.
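
A hedged sketch of the suggested error handling; the event name follows the walkthrough diagram and the logger is an assumption:

import logging
import analytics  # as imported in the script under review

logger = logging.getLogger(__name__)

def track_safely(properties: dict) -> None:
    try:
        analytics.track(user_id="cognee_regular_eval", event="cognee_eval_results", properties=properties)
        analytics.flush()
    except Exception as exc:  # network or serialization errors should not crash the eval job
        logger.error("Failed to send evaluation results to Segment: %s", exc)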


40-46: Clarify references to “PostHog” vs “Segment.”

Your CLI help text and environment variables mention “PostHog,” but the code references sending events to Segment analytics. Aligning names and logs reduces confusion for future contributors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bcd8828 and 7bb8a73.

📒 Files selected for processing (2)
  • cognee/tests/evaluation/modal_run_regular_eval.py (1 hunks)
  • cognee/tests/evaluation/send_results_to_segment.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (33)
  • GitHub Check: Test on macos-15
  • GitHub Check: Test on macos-13
  • GitHub Check: Test on macos-15
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: Test on macos-15
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-13
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-13
  • GitHub Check: chromadb test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: run_networkx_metrics_test / test
  • GitHub Check: run_simple_example_test / test
  • GitHub Check: windows-latest
  • GitHub Check: run_dynamic_steps_example_test / test
  • GitHub Check: run_eval_framework_test / test
  • GitHub Check: run_multimedia_example_test / test
  • GitHub Check: test
  • GitHub Check: lint (ubuntu-latest, 3.11.x)
  • GitHub Check: lint (ubuntu-latest, 3.10.x)
  • GitHub Check: test
  • GitHub Check: docker-compose-test
  • GitHub Check: run_simple_example_test
🔇 Additional comments (3)
cognee/tests/evaluation/modal_run_regular_eval.py (2)

22-23: Question the default configuration fallback.

When eval_params is None, you instantiate a complete default configuration. Ensure this is always the intended behavior. If any scenario requires minimal or empty evaluation settings, you might inadvertently override them with defaults.

Do you want to confirm that running the entire pipeline with default settings is always acceptable? If not, consider providing a more minimal fallback or returning an empty dictionary.


32-38: Avoid returning None if downstream scripts expect valid JSON.

Currently, if metrics are skipped, the function returns None, which leads to "null" in JSON. Some consumers of your results may break if they rely on a dictionary structure.

Below is a quick fix to return an empty dictionary instead of None:

-if not eval_params.get("evaluating_answers") or not eval_params.get("calculate_metrics"):
-    logger.info("Skipping metrics collection ...")
-    return None
+if not eval_params.get("evaluating_answers") or not eval_params.get("calculate_metrics"):
+    logger.info("Skipping metrics collection ...")
+    return {}
cognee/tests/evaluation/send_results_to_segment.py (1)

16-17: Validate SEGMENT_WRITE_KEY before use.

If SEGMENT_WRITE_KEY is unset or invalid, calling analytics.track will silently fail or raise unexpected errors. Consider handling this scenario with a log warning or exception.

Do you want to add a check to raise a clear error if SEGMENT_WRITE_KEY is None or empty?
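
A small sketch of failing fast when the key is missing, assuming it is read from the environment as in the script:

import os

segment_write_key = os.getenv("SEGMENT_WRITE_KEY")
if not segment_write_key:
    raise RuntimeError("SEGMENT_WRITE_KEY is not set; refusing to send events to Segment.")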

@alekszievr alekszievr changed the title from "feat: regular cognee eval to posthog [COG-1414]" to "feat: regular cognee eval to segment [COG-1414]" on Mar 14, 2025
from dotenv import load_dotenv
import argparse
from cognee.shared.utils import setup_logging
import analytics
Member

Where do we get analytics from?

Contributor Author

I modified this to use the newer name of the package and added it to poetry

properties["created_at"] = created_at

# Send event to Segment
analytics.track(
Member

If analytics is some analytics library, then I don't understand why we have this test?

@soobrosa soobrosa deleted the feat/cog-1414-regular-cognee-eval branch May 28, 2025 12:50