-
Notifications
You must be signed in to change notification settings - Fork 966
Changes Neo4j add_edge method and implements unit tests around get_graph_from_model logic #474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes Neo4j add_edge method and implements unit tests around get_graph_from_model logic #474
Conversation
WalkthroughThis pull request focuses on removing the Changes
Sequence DiagramsequenceDiagram
participant API as Cognify API
participant Tasks as Add Data Points Task
participant GraphUtils as Graph Utilities
participant Database as Neo4j Database
API->>Tasks: Invoke add_data_points
Tasks->>GraphUtils: Get graph from model
GraphUtils-->>Tasks: Return graph nodes and edges
Tasks->>Database: Add nodes and edges
Database-->>Tasks: Confirm addition
Possibly related PRs
Suggested labels
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (21)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
…nsive-checks-around-the
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)
208-221: Consider indexing for performance when usingMATCHwithUNWIND.These lines create or merge a large number of relationships via
apoc.merge.relationship. Ifedge.from_nodeandedge.to_nodefields are not indexed in Neo4j, the repeatedMATCH (from_node {id: edge.from_node})andMATCH (to_node {id: edge.to_node})operations underUNWINDcould degrade performance for large lists of edges. Ensuring proper indexing on theidproperty will help optimize these queries.cognee/tests/unit/interfaces/graph/get_graph_from_model_unit_tests.py (1)
8-8: Remove unused import.The
is_typeimport fromIPython.utils.wildcardis not used in any of the test cases.-from IPython.utils.wildcard import is_type
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cognee/api/v1/cognify/cognify_v2.py(1 hunks)cognee/infrastructure/databases/graph/neo4j_driver/adapter.py(1 hunks)cognee/modules/graph/utils/get_graph_from_model.py(1 hunks)cognee/tasks/storage/add_data_points.py(1 hunks)cognee/tests/unit/interfaces/graph/get_graph_from_model_unit_tests.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Publish Cognee Docker image
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: docker-compose-test
🔇 Additional comments (7)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)
208-221: Verify the newapoc.merge.relationshiplogic.Previously, if this code used
apoc.create.relationship, changing it toapoc.merge.relationshipcan alter behavior for existing relationships. Specifically,apoc.merge.relationshipwill merge properties on any matched relationship, rather than creating duplicates. If this is intentional (avoiding duplicates), it’s a beneficial move; otherwise, re-check whether you prefer always creating new relationships vs. merging.cognee/tasks/storage/add_data_points.py (1)
8-8: Confirm backward compatibility after removingonly_root.Removing the
only_rootparameter fromadd_data_pointsmay affect code depending on that argument. If any callers passonly_rootor expect root-only behavior, this could introduce unexpected runtime errors or logic changes. Double-check all call sites to confirm they do not rely on that parameter.cognee/modules/graph/utils/get_graph_from_model.py (1)
100-100: Validate thatfield_valuealways has anidattribute.At line 100, the check depends on
field_value.id. Ensure thatfield_valueis indeed a DataPoint or a compatible type with theidattribute. If not, this code could raise runtime errors or skip creating nodes for valid objects. Consider adding input validations or type hints to make assumptions explicit.As a follow-up, you can run a script to confirm that all
field_valueinstances passed here areDataPointor subtypes:✅ Verification successful
The code already validates that
field_valuehas anidattributeThe implementation includes proper type checking through
isinstance(field_value, DataPoint)before accessing theidattribute, ensuring type safety at runtime. This validation prevents any potential attribute errors.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Use ast-grep to find calls to get_graph_from_model and examine argument types. ast-grep --pattern $'await get_graph_from_model($_, $$$)'Length of output: 2472
Script:
#!/bin/bash # Get function definition and surrounding context ast-grep --pattern 'async def get_graph_from_model($$$)' -A 10 # Get context around the id access rg "field_value.id" -B 5 -A 5 # Get test setup to see field_value types rg "field_value.*=|data_point.*=" cognee/tests/unit/interfaces/graph/get_graph_from_model_unit_tests.py -B 2 -A 2Length of output: 5531
cognee/api/v1/cognify/cognify_v2.py (1)
163-163: Verify the impact of removingonly_rootparameter.The removal of
only_root=Truefrom theadd_data_pointstask configuration aligns with the broader refactoring effort. However, we should ensure this change doesn't affect the existing data processing behavior.Run the following script to check for any remaining usages of
only_rootin the task configuration and related files:✅ Verification successful
Removal of
only_rootparameter is safeThe parameter has been consistently removed across all task configurations in the codebase, and all similar tasks follow the same pattern without using this parameter. The change aligns with the broader refactoring effort and maintains consistency with other pipeline configurations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any remaining references to only_root in task configurations and related files rg -g '!*.pyc' -g '!*test*' 'only_root' .Length of output: 41
Script:
#!/bin/bash # Find Task class definition and similar configurations echo "=== Searching for Task class definition ===" ast-grep --pattern 'class Task' echo -e "\n=== Searching for task configurations ===" rg -g '!*.pyc' -g '!*test*' 'Task\(' -A 2Length of output: 6869
cognee/tests/unit/interfaces/graph/get_graph_from_model_unit_tests.py (3)
17-32: LGTM! Well-structured test case.The basic initialization test is well-organized with clear setup, execution, and assertions. The docstring effectively describes the test's purpose.
34-62: LGTM! Comprehensive test for single neighbor relationship.The test effectively validates both node creation and edge formation for a simple entity-type relationship.
64-121: LGTM! Thorough test for complex nested relationships.The test comprehensively covers multiple aspects:
- Multiple entity nodes with shared type
- Document and chunk relationships
- Multiple edge types (is_a, is_part_of, contains)
cognee/tests/unit/interfaces/graph/get_graph_from_model_unit_tests.py
Outdated
Show resolved
Hide resolved
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_get_graph_from_model_only_root(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If only_root is removed from the method, let's remove the test as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this PR is not finished, first I wanted to test it then I found a bug and that's why I changed the Adapter because instead of debugging the get_graph_from_model its a better solution in general I think.
Description
This PR includes extensive unit tests around the get_graph_from_model logic and changes the neo4j add_edges logic.
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Refactor
only_rootparameter from various functions.Tests
get_graph_from_modelfunction, covering basic initialization, single neighbor, and multiple nested connections scenarios.Chores