Skip to content

Conversation

@pazone
Copy link
Contributor

@pazone pazone commented Nov 8, 2025

Description

Resolve issue with cypher search by encoding the return value from the cypher query into JSON. Uses fastapi json encoder

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Example of result now with Cypher search with the following query "MATCH (src)-[rel]->(nbr) RETURN src, rel" on Simple example:

{
   "search_result":[
      [
         [
            {
               "_id":{
                  "offset":0,
                  "table":0
               },
               "_label":"Node",
               "id":"87372381-a9fe-5b82-9c92-3f5dbab1bc35",
               "name":"",
               "type":"DocumentChunk",
               "created_at":"2025-11-05T14:12:46.707597",
               "updated_at":"2025-11-05T14:12:54.801747",
               "properties":"{\"created_at\": 1762351945009, \"updated_at\": 1762351945009, \"ontology_valid\": false, \"version\": 1, \"topological_rank\": 0, \"metadata\": {\"index_fields\": [\"text\"]}, \"belongs_to_set\": null, \"text\": \"\\n    Natural language processing (NLP) is an interdisciplinary\\n    subfield of computer science and information retrieval.\\n    \", \"chunk_size\": 48, \"chunk_index\": 0, \"cut_type\": \"paragraph_end\"}"
            },
            {
               "_src":{
                  "offset":0,
                  "table":0
               },
               "_dst":{
                  "offset":1,
                  "table":0
               },
               "_label":"EDGE",
               "_id":{
                  "offset":0,
                  "table":1
               },
               "relationship_name":"contains",
               "created_at":"2025-11-05T14:12:47.217590",
               "updated_at":"2025-11-05T14:12:55.193003",
               "properties":"{\"source_node_id\": \"87372381-a9fe-5b82-9c92-3f5dbab1bc35\", \"target_node_id\": \"bc338a39-64d6-549a-acec-da60846dd90d\", \"relationship_name\": \"contains\", \"updated_at\": \"2025-11-05 14:12:54\", \"relationship_type\": \"contains\", \"edge_text\": \"relationship_name: contains; entity_name: natural language processing (nlp); entity_description: An interdisciplinary subfield of computer science and information retrieval concerned with interactions between computers and human (natural) languages.\"}"
            }
         ]
      ]
   ],
   "dataset_id":"UUID(""af4b1c1c-90fc-59b7-952c-1da9bbde370c"")",
   "dataset_name":"main_dataset",
   "graphs":"None"
}

Relates to #1725 Issue: #1723

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Description

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

<!-- .github/pull_request_template.md -->

## Description
Resolve issue with cypher search by encoding the return value from the
cypher query into JSON. Uses fastapi json encoder


## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
Example of result now with Cypher search with the following query "MATCH
(src)-[rel]->(nbr) RETURN src, rel" on Simple example:
```
{
   "search_result":[
      [
         [
            {
               "_id":{
                  "offset":0,
                  "table":0
               },
               "_label":"Node",
               "id":"87372381-a9fe-5b82-9c92-3f5dbab1bc35",
               "name":"",
               "type":"DocumentChunk",
               "created_at":"2025-11-05T14:12:46.707597",
               "updated_at":"2025-11-05T14:12:54.801747",
               "properties":"{\"created_at\": 1762351945009, \"updated_at\": 1762351945009, \"ontology_valid\": false, \"version\": 1, \"topological_rank\": 0, \"metadata\": {\"index_fields\": [\"text\"]}, \"belongs_to_set\": null, \"text\": \"\\n    Natural language processing (NLP) is an interdisciplinary\\n    subfield of computer science and information retrieval.\\n    \", \"chunk_size\": 48, \"chunk_index\": 0, \"cut_type\": \"paragraph_end\"}"
            },
            {
               "_src":{
                  "offset":0,
                  "table":0
               },
               "_dst":{
                  "offset":1,
                  "table":0
               },
               "_label":"EDGE",
               "_id":{
                  "offset":0,
                  "table":1
               },
               "relationship_name":"contains",
               "created_at":"2025-11-05T14:12:47.217590",
               "updated_at":"2025-11-05T14:12:55.193003",
               "properties":"{\"source_node_id\": \"87372381-a9fe-5b82-9c92-3f5dbab1bc35\", \"target_node_id\": \"bc338a39-64d6-549a-acec-da60846dd90d\", \"relationship_name\": \"contains\", \"updated_at\": \"2025-11-05 14:12:54\", \"relationship_type\": \"contains\", \"edge_text\": \"relationship_name: contains; entity_name: natural language processing (nlp); entity_description: An interdisciplinary subfield of computer science and information retrieval concerned with interactions between computers and human (natural) languages.\"}"
            }
         ]
      ]
   ],
   "dataset_id":"UUID(""af4b1c1c-90fc-59b7-952c-1da9bbde370c"")",
   "dataset_name":"main_dataset",
   "graphs":"None"
}
```
Relates to #1725
Issue: #1723

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
@pull-checklist
Copy link

pull-checklist bot commented Nov 8, 2025

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 8, 2025

Walkthrough

The retrieval module's query results are now JSON-serializable. A new import of jsonable_encoder from fastapi.encoders wraps the graph engine query output to ensure proper encoding without affecting existing control flow or error handling.

Changes

Cohort / File(s) Summary
JSON Serialization Enhancement
cognee/modules/retrieval/cypher_search_retriever.py
Added import of jsonable_encoder from fastapi.encoders; wrapped graph_engine.query(query) result with jsonable_encoder for JSON serialization.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A tiny hop, a gentle wrap,
JSON flows without a gap,
The encoder guards the data's way,
Serialized and bright today!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Cherry-pick: Fix cypher search (#1739)' is related to the changeset and refers to fixing a bug in cypher search, which matches the bug fix nature of the changes, though it could be more descriptive.
Description check ✅ Passed The description includes required sections (Description, Type of Change, Pre-submission Checklist, DCO Affirmation) with the bug fix clearly identified, a detailed example output provided, and relevant issue/PR references linked, though most pre-submission checklist items remain unchecked.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix_cypher_cherrypick

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26c18e9 and 007c7d4.

📒 Files selected for processing (1)
  • cognee/modules/retrieval/cypher_search_retriever.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py: Use 4-space indentation; name modules and functions in snake_case; name classes in PascalCase (Python)
Adhere to ruff rules, including import hygiene and configured line length (100)
Keep Python lines ≤ 100 characters

Files:

  • cognee/modules/retrieval/cypher_search_retriever.py
cognee/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

cognee/**/*.py: Public APIs in the core library should be type-annotated where practical
Prefer explicit, structured error handling and use shared logging utilities from cognee.shared.logging_utils

Files:

  • cognee/modules/retrieval/cypher_search_retriever.py
🧬 Code graph analysis (1)
cognee/modules/retrieval/cypher_search_retriever.py (2)
cognee/infrastructure/databases/graph/kuzu/adapter.py (1)
  • query (210-278)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)
  • query (100-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: End-to-End Tests / Test Entity Extraction
  • GitHub Check: End-to-End Tests / Conversation sessions test
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test Feedback Enrichment
  • GitHub Check: End-to-End Tests / Concurrent Subprocess access test
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: Basic Tests / Run Unit Tests
🔇 Additional comments (2)
cognee/modules/retrieval/cypher_search_retriever.py (2)

2-3: FastAPI is already a core project dependency; this change does not introduce new coupling.

The concern in the original review comment is based on an incorrect premise. FastAPI is already declared as a core dependency in pyproject.toml ("fastapi>=0.116.2,<1.0.0"), and CypherSearchRetriever is already integrated into the search pipeline that serves API endpoints. Using jsonable_encoder here is appropriate for ensuring query results are JSON-serializable before being returned through the API.

Likely an incorrect or invalid review comment.


55-55: Type annotation improvement recommended; JSON encoding change is appropriate.

The jsonable_encoder wrapping is correct for API serialization. Graph engines return List[Dict[str, Any]] (Neo4j) or List[Tuple] (Kuzu), and encoding ensures consistent JSON-serializable output for API consumers.

However, improve the return type annotation from Any to List[Dict[str, Any]] to accurately reflect the JSON-encoded output structure and provide clarity to callers.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Vasilije1990 Vasilije1990 merged commit 7557e6f into main Nov 9, 2025
137 of 142 checks passed
@Vasilije1990 Vasilije1990 deleted the fix_cypher_cherrypick branch November 9, 2025 05:48
Varming73 pushed a commit to Varming73/cognee that referenced this pull request Nov 10, 2025
Merged upstream changes from official Cognee repository:
- Fix Cypher search JSON encoding (topoteretes#1739, topoteretes#1763)
- Update Ollama API to use OpenAI-compatible embeddings
- Update Python version range documentation
- Version bump to v0.4.0

Changes applied:
- README.md: Python version range update
- cognee/modules/retrieval/cypher_search_retriever.py: JSON encoding fix
- pyproject.toml: Version bump to 0.4.0
- uv.lock: Regenerated with updated dependencies
- CLAUDE.md: Updated sync metadata

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants