Add unit tests for storage utility functions (#2312) by haroldfabla2-hue · Pull Request #2338 · topoteretes/cognee

haroldfabla2-hue · 2026-03-09T09:06:17Z

Summary

Adds unit tests for the storage utility functions in cognee/modules/storage/utils/__init__.py as requested in issue #2312.

Changes

Created cognee/tests/unit/modules/storage/test_utils.py with:
- 5 unit tests for copy_model() covering basic copying, include_fields, exclude_fields, combined usage, and edge cases
- 3 unit tests for get_own_properties() covering basic extraction, nested object exclusion, and primitive value preservation
Created cognee/tests/unit/modules/storage/__init__.py (required for pytest discovery)

Testing

Tests are designed to run without external dependencies (pure unit tests with mocked LLM where needed).

Closes #2312

Summary by CodeRabbit

New Features
- Added Python-based container entrypoint with environment variable configuration and automated database migrations.
- Introduced debug mode and environment-specific server configuration.
Improvements
- Updated database schema handling to support multi-schema setups via search_path.
Tests
- Added comprehensive unit tests for storage utility functions.
Chores
- Updated Docker configuration for improved Windows/Docker compatibility.

- Add entrypoint.py as portable alternative to bash script - Windows Docker Desktop has issues executing .sh files - Python entrypoint works cross-platform without shell dependencies - Fallback bash entrypoint.sh kept for Linux/macOS environments Fixes: topoteretes#2274

…coded 'public' schema This fixes the bug where delete_dataset() fails in non-public Postgres schemas because the schema was hardcoded to 'public'. The fix changes the default schema_name from 'public' to None, which makes the function use the database's search_path configuration. Closes: topoteretes#2291

- Add test_utils.py with tests for copy_model() and get_own_properties() - Tests cover basic copying, include_fields, exclude_fields, and edge cases - Follows existing test structure in cognee/tests/unit/modules/

pull-checklist · 2026-03-09T09:06:22Z

Please make sure all the checkboxes are checked:

I have tested these changes locally.
I have reviewed the code changes.
I have added end-to-end and unit tests (if applicable).
I have updated the documentation and README.md file (if necessary).
I have removed unnecessary code and debug statements.
PR title is clear and follows the convention.
I have tagged reviewers or team members for feedback.

coderabbitai · 2026-03-09T09:06:41Z

Walkthrough

This PR introduces a new Python entrypoint for Docker with database migration and Gunicorn server startup logic, updates the SqlAlchemyAdapter method signature to support multi-schema database operations, adds unit tests for storage utility functions, and reconfigures the Dockerfile to use the new Python entrypoint.

Changes

Cohort / File(s)	Summary
Docker Configuration `Dockerfile`, `entrypoint.py`	Dockerfile updated to copy and execute entrypoint.py instead of entrypoint.sh. New Python entrypoint script added with Alembic migration execution, debugpy integration for development, Gunicorn server launch with UvicornWorker, and environment-based configuration (DEBUG, ENVIRONMENT, ports).
Database Adapter `cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py`	Method signature changed: `delete_entity_by_id` schema_name parameter default shifted from "public" to None to support multi-schema setups via database search_path; docstring updated accordingly.
Storage Utility Tests `cognee/tests/unit/modules/storage/test_utils.py`	New unit test file covering copy_model and get_own_properties functions with scenarios including basic copying, field inclusion/exclusion, and edge cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

fix: entrypoint.sh to not fail on first docker up, improved handling of migrations, signals and errors. #546: Modifies container startup and migration/error-handling logic; this PR extends/replaces shell entrypoint with a comprehensive Python entrypoint performing migration and server launch tasks.

Suggested reviewers

Vasilije1990

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	The PR description focuses on storage utility function tests but the raw_summary includes substantial unrelated changes: Dockerfile Docker entrypoint updates, SqlAlchemyAdapter schema_name parameter changes, and a new entrypoint.py script.	Remove the Dockerfile, SqlAlchemyAdapter, and entrypoint.py changes from this PR and create separate PRs for each distinct feature (Docker entrypoint refactoring and schema handling improvements).
Linked Issues check	❓ Inconclusive	The PR addresses `#2312`'s core testing requirements but the raw_summary shows changes to Dockerfile, SqlAlchemyAdapter, and entrypoint.py that are not mentioned in the PR description and appear unrelated to the storage utility function tests.	Clarify whether the Dockerfile, SqlAlchemyAdapter, and entrypoint.py changes are intentional or mistakenly included in this PR; if unintended, move them to separate PRs focused on their specific objectives.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'Add unit tests for storage utility functions (`#2312`)' clearly and concisely describes the main objective of the changeset, focusing on the primary contribution of new unit tests.
Description check	✅ Passed	The PR description includes a summary, detailed changes, and testing information. While some template sections are omitted, the essential information is present and the description adequately communicates the PR's purpose and scope.
Docstring Coverage	✅ Passed	Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1)
275-288: ⚠️ Potential issue | 🔴 Critical

Fix critical bug: schema_name=None causes full_table_name to become literal string "None.table".

When delete_entity_by_id is called without schema_name, it defaults to None and passes this explicitly to get_table(). In get_table() at line 410, the code constructs:
full_table_name = f"{schema_name}.{table_name}"  # becomes "None.users" (literal string)
This will never match actual table names in metadata.tables, causing EntityNotFoundError on line 414 even when the table exists. The metadata.reflect(schema=None) call works correctly for PostgreSQL search_path, but the subsequent table name lookup fails.

For non-SQLite databases, handle schema_name=None explicitly before constructing full_table_name. When None, either omit the schema prefix or explicitly use "public" (similar to the pattern at line 655).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py`
around lines 275 - 288, delete_entity_by_id is passing schema_name=None into
get_table which builds full_table_name = f"{schema_name}.{table_name}",
producing "None.table" and failing lookup; update the call-site
(delete_entity_by_id) and/or get_table to explicitly handle schema_name is None
by either omitting the schema prefix when building full_table_name or
substituting a real schema like "public" for non-SQLite DBs (mirror the pattern
used around line 655), and ensure get_table never constructs f"{None}.{...}" so
metadata.tables lookups succeed and EntityNotFoundError is not raised
incorrectly.

🧹 Nitpick comments (1)

entrypoint.py (1)

11-25: Consider adding type hints for consistency.

The function lacks type hints. While not critical for an entrypoint script, adding them would align with project guidelines.

💡 Proposed type hints

-def run_command(cmd, cwd=None, check=True):
+def run_command(cmd: list[str], cwd: str | None = None, check: bool = True) -> subprocess.CompletedProcess:
     """Run a shell command and return the result."""

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@entrypoint.py` around lines 11 - 25, Add type hints to run_command: annotate
cmd as Sequence[str], cwd as Optional[Union[str, Path]] = None, and check as
bool = True, and set the return type to subprocess.CompletedProcess[str]; also
import the necessary typing symbols (Optional, Sequence, Union) and Path from
pathlib at the top of the file. Ensure the function signature is updated to def
run_command(cmd: Sequence[str], cwd: Optional[Union[str, Path]] = None, check:
bool = True) -> subprocess.CompletedProcess[str]: and adjust imports
accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee/tests/unit/modules/storage/test_utils.py`:
- Around line 95-105: The test incorrectly asserts that belongs_to_set is
excluded even though get_own_properties only excludes lists whose first element
is a DataPoint; update the test to match implementation by either (A) changing
the assertion to assert "belongs_to_set" in properties, or (B) if you intend to
test exclusion of nested DataPoint objects, replace the list value with
DataPoint instances (e.g., belongs_to_set=[DataPoint(...), DataPoint(...)]) so
get_own_properties will exclude it; refer to the DataPoint constructor and
get_own_properties function to make the change.
- Around line 13-24: The test fails because copy_model expects a class (accesses
model.__name__) but the test passes an instance; either update the test to call
copy_model(type(original)) or make copy_model accept instances by normalizing
its input (e.g., in copy_model, detect if the input is a class or instance and
set model_class = model if inspect.isclass(model) else type(model), then use
model_class.__name__ and model_class.model_fields); pick one approach and apply
consistently with other callers like get_model_instance_from_graph,
get_graph_from_model, and LanceDBAdapter.

---

Outside diff comments:
In `@cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py`:
- Around line 275-288: delete_entity_by_id is passing schema_name=None into
get_table which builds full_table_name = f"{schema_name}.{table_name}",
producing "None.table" and failing lookup; update the call-site
(delete_entity_by_id) and/or get_table to explicitly handle schema_name is None
by either omitting the schema prefix when building full_table_name or
substituting a real schema like "public" for non-SQLite DBs (mirror the pattern
used around line 655), and ensure get_table never constructs f"{None}.{...}" so
metadata.tables lookups succeed and EntityNotFoundError is not raised
incorrectly.

---

Nitpick comments:
In `@entrypoint.py`:
- Around line 11-25: Add type hints to run_command: annotate cmd as
Sequence[str], cwd as Optional[Union[str, Path]] = None, and check as bool =
True, and set the return type to subprocess.CompletedProcess[str]; also import
the necessary typing symbols (Optional, Sequence, Union) and Path from pathlib
at the top of the file. Ensure the function signature is updated to def
run_command(cmd: Sequence[str], cwd: Optional[Union[str, Path]] = None, check:
bool = True) -> subprocess.CompletedProcess[str]: and adjust imports
accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 60257173-c4c1-4543-ab6d-35bc7777336b

📥 Commits

Reviewing files that changed from the base of the PR and between c9370a8 and 6edc0b0.

📒 Files selected for processing (5)

Dockerfile
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
cognee/tests/unit/modules/storage/__init__.py
cognee/tests/unit/modules/storage/test_utils.py
entrypoint.py

coderabbitai · 2026-03-09T09:12:03Z

cognee/tests/unit/modules/storage/test_utils.py

+    def test_copy_model_basic(self):
+        """Test basic DataPoint copying without modifications."""
+        original = DataPoint(
+            id=UUID("12345678-1234-5678-1234-567812345678"),
+            type="TestPoint",
+        )
+
+        copied = copy_model(original)
+
+        assert copied.__name__ == "TestPoint"
+        assert "id" in copied.model_fields
+        assert "type" in copied.model_fields


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check the actual copy_model implementation for __name__ usage ast-grep --pattern $'def copy_model($$$): $$$ '

Repository: topoteretes/cognee

Length of output: 1570

🏁 Script executed:

#!/bin/bash # Also check if __name__ is accessed on class or instance rg -n "__name__" cognee/modules/storage/utils/__init__.py -B2 -A2

Repository: topoteretes/cognee

Length of output: 253

🏁 Script executed:

#!/bin/bash # Check if DataPoint is defined and what it inherits from rg -n "class DataPoint" -A5 cognee/

Repository: topoteretes/cognee

Length of output: 1669

🏁 Script executed:

#!/bin/bash # Check all usages of copy_model to see if it's called with instances or classes rg -n "copy_model\(" cognee/ -B2 -A2 | head -100

Repository: topoteretes/cognee

Length of output: 4626

🏁 Script executed:

#!/bin/bash # Check if BaseModel instances have __name__ by looking at Pydantic usage patterns rg -n "\.\_\_name\_\_" cognee/modules/storage/ -B2 -A2

Repository: topoteretes/cognee

Length of output: 144

The test passes an instance to copy_model() but the implementation expects a class.

The test creates a DataPoint instance and passes it to copy_model(original), but the implementation at line 37 accesses model.__name__, which only exists on classes. When an instance is passed, this will raise AttributeError: 'DataPoint' object has no attribute '__name__'.

All other usages in the codebase pass the class via type(source_node) or similar (see get_model_instance_from_graph.py, get_graph_from_model.py, LanceDBAdapter.py). Either the function signature should accept type[DataPoint] instead of DataPoint, or the tests should pass type(original) instead of original.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/tests/unit/modules/storage/test_utils.py` around lines 13 - 24, The test fails because copy_model expects a class (accesses model.__name__) but the test passes an instance; either update the test to call copy_model(type(original)) or make copy_model accept instances by normalizing its input (e.g., in copy_model, detect if the input is a class or instance and set model_class = model if inspect.isclass(model) else type(model), then use model_class.__name__ and model_class.model_fields); pick one approach and apply consistently with other callers like get_model_instance_from_graph, get_graph_from_model, and LanceDBAdapter.

coderabbitai · 2026-03-09T09:12:03Z

cognee/tests/unit/modules/storage/test_utils.py

+    def test_get_own_properties_excludes_nested(self):
+        """Test that nested objects are excluded."""
+        data_point = DataPoint(
+            type="TestPoint",
+            belongs_to_set=["item1", "item2"],
+        )
+
+        properties = get_own_properties(data_point)
+
+        # belongs_to_set should be excluded as it's a list with specific types
+        assert "belongs_to_set" not in properties


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the actual get_own_properties implementation rg -n "def get_own_properties" cognee/modules/storage/utils/__init__.py -A 20

Repository: topoteretes/cognee

Length of output: 655

Test assertion is incorrect based on the implementation.

The test expects belongs_to_set to be excluded when set to ["item1", "item2"], but the get_own_properties implementation only excludes lists where the first element is a DataPoint instance. Since "item1" is a string, belongs_to_set will be included in the returned properties, causing this test to fail. Either correct the assertion to check that "belongs_to_set" is in properties, or adjust the test data to use DataPoint instances if the intent is to test exclusion of nested objects.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/tests/unit/modules/storage/test_utils.py` around lines 95 - 105, The test incorrectly asserts that belongs_to_set is excluded even though get_own_properties only excludes lists whose first element is a DataPoint; update the test to match implementation by either (A) changing the assertion to assert "belongs_to_set" in properties, or (B) if you intend to test exclusion of nested DataPoint objects, replace the list value with DataPoint instances (e.g., belongs_to_set=[DataPoint(...), DataPoint(...)]) so get_own_properties will exclude it; refer to the DataPoint constructor and get_own_properties function to make the change.

Alberto Farah and others added 3 commits March 7, 2026 07:44

Add unit tests for storage utility functions (topoteretes#2312)

6edc0b0

- Add test_utils.py with tests for copy_model() and get_own_properties() - Tests cover basic copying, include_fields, exclude_fields, and edge cases - Follows existing test structure in cognee/tests/unit/modules/

coderabbitai bot reviewed Mar 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add unit tests for storage utility functions (#2312)#2338

Add unit tests for storage utility functions (#2312)#2338
haroldfabla2-hue wants to merge 3 commits intotopoteretes:mainfrom
haroldfabla2-hue:test/issue-2312-storage-utils

haroldfabla2-hue commented Mar 9, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

pull-checklist bot commented Mar 9, 2026

Uh oh!

coderabbitai bot commented Mar 9, 2026 •

edited

Loading

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Mar 9, 2026

Uh oh!

coderabbitai bot Mar 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

haroldfabla2-hue commented Mar 9, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Testing

Summary by CodeRabbit

Uh oh!

pull-checklist bot commented Mar 9, 2026

Please make sure all the checkboxes are checked:

Uh oh!

coderabbitai bot commented Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

haroldfabla2-hue commented Mar 9, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 9, 2026 •

edited

Loading