Update get_cognify_router.py with graph endpoint #372
Conversation
Walkthrough: The pull request introduces modifications to the cognify router, adding graph model support to the endpoint and the call into `cognee_cognify`.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (1)
cognee/api/v1/cognify/routers/get_cognify_router.py (1)
18-18: Enhance endpoint documentation

The current docstring could be more descriptive about the new graph functionality and parameters. Consider expanding the docstring:

```diff
-    """ This endpoint is responsible for the cognitive processing of the content."""
+    """Process content cognitively with graph model support.
+
+    Args:
+        payload (CognifyPayloadDTO): Contains datasets and graph model configuration
+        graph_model (BaseModel): Graph processing model configuration
+        user (User): Authenticated user information
+
+    Returns:
+        JSONResponse: Success response or error details with appropriate status code
+    """
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- cognee/api/v1/cognify/routers/get_cognify_router.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
cognee/api/v1/cognify/routers/get_cognify_router.py
17-17: SyntaxError: Parameter without a default cannot follow a parameter with a default
```diff
     try:
-        await cognee_cognify(payload.datasets, user)
+        await cognee_cognify(payload.datasets, user, payload.graph_model)
     except Exception as error:
         return JSONResponse(
             status_code=409,
             content={"error": str(error)}
         )
```
🛠️ Refactor suggestion
Improve error handling and add success response
The current implementation has several concerns:
- All exceptions result in a 409 status code, which might not be appropriate for all error cases
- No explicit success response is defined
Consider this improved implementation:
```diff
     try:
         await cognee_cognify(payload.datasets, user, payload.graph_model)
+        return JSONResponse(
+            status_code=200,
+            content={"message": "Cognitive processing completed successfully"}
+        )
     except Exception as error:
+        # Log the error for debugging
+        error_code = getattr(error, 'status_code', 409)
         return JSONResponse(
-            status_code=409,
+            status_code=error_code,
             content={"error": str(error)}
         )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
try:
    await cognee_cognify(payload.datasets, user, payload.graph_model)
    return JSONResponse(
        status_code=200,
        content={"message": "Cognitive processing completed successfully"}
    )
except Exception as error:
    # Log the error for debugging
    error_code = getattr(error, 'status_code', 409)
    return JSONResponse(
        status_code=error_code,
        content={"error": str(error)}
    )
```
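The `getattr(error, 'status_code', 409)` fallback in the suggestion can be exercised in isolation. A minimal sketch; the exception classes below are hypothetical stand-ins, not part of cognee:

```python
class NotFoundError(Exception):
    # Hypothetical exception that carries its own HTTP status code
    status_code = 404

def resolve_status(error: Exception) -> int:
    # Use the exception's status_code attribute when present,
    # otherwise fall back to the generic 409 Conflict
    return getattr(error, "status_code", 409)

print(resolve_status(NotFoundError()))     # → 404
print(resolve_status(ValueError("boom")))  # → 409
```

This keeps the single `except Exception` handler while letting exceptions that know their own status propagate it to the client.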
```diff
 @router.post("/", response_model=None)
-async def cognify(payload: CognifyPayloadDTO, user: User = Depends(get_authenticated_user)):
+async def cognify(payload: CognifyPayloadDTO, user: User = Depends(get_authenticated_user), graph_model:BaseModel):
```
Fix parameter ordering to comply with Python syntax
Parameters with default values must come after parameters without defaults. Also, consider adding a more specific response model type.
Apply this diff to fix the parameter ordering:
```diff
-async def cognify(payload: CognifyPayloadDTO, user: User = Depends(get_authenticated_user), graph_model:BaseModel):
+async def cognify(payload: CognifyPayloadDTO, graph_model: BaseModel, user: User = Depends(get_authenticated_user)):
```

📝 Committable suggestion
```python
async def cognify(payload: CognifyPayloadDTO, graph_model: BaseModel, user: User = Depends(get_authenticated_user)):
```
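The ordering rule Ruff flags here is plain Python syntax, independent of FastAPI. A minimal stand-alone sketch:

```python
# SyntaxError: a parameter without a default cannot follow one with a default
# def f(a, b=1, c): ...

# Fix: move the non-default parameter before the defaulted one
def f(a, c, b=1):
    return (a, b, c)

print(f(1, 2))       # → (1, 1, 2)
print(f(1, 2, b=5))  # → (1, 5, 2)
```

In the endpoint, FastAPI's `Depends(...)` default is what makes `user` a defaulted parameter, so `graph_model` (no default) must come before it.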
🧰 Tools
🪛 Ruff (0.8.2)
17-17: SyntaxError: Parameter without a default cannot follow a parameter with a default
```python
class CognifyPayloadDTO(BaseModel):
    datasets: List[str]
```
Add missing graph_model field to CognifyPayloadDTO
The DTO is missing the graph_model field which is being accessed in the implementation. This will cause runtime errors.
Apply this diff to add the required field:
```diff
 class CognifyPayloadDTO(BaseModel):
     datasets: List[str]
+    graph_model: BaseModel
```

Committable suggestion skipped: line range outside the PR's diff.
```diff
 from cognee.api.v1.cognify.cognify_v2 import cognify as cognee_cognify
```

```diff
     try:
-        await cognee_cognify(payload.datasets, user)
+        await cognee_cognify(payload.datasets, user, payload.graph_model)
```
💡 Codebase verification
Function signature mismatch detected
The cognee_cognify function in cognify_v2.py has a different signature than how it's being called:
- Implementation: `async def cognify(datasets, user, graph_model=KnowledgeGraph)`
- Usage: `await cognee_cognify(payload.datasets, user, payload.graph_model)`
The implementation has graph_model as an optional parameter with default value KnowledgeGraph, but it's being passed payload.graph_model which doesn't exist in the CognifyPayloadDTO class (which only has datasets field).
- `cognee/api/v1/cognify/routers/get_cognify_router.py`: Update `CognifyPayloadDTO` to include the `graph_model` field
- `cognee/api/v1/cognify/routers/get_cognify_router.py`: Remove the separate `graph_model` parameter from the route handler as it's redundant
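The mismatch is easier to see in a runnable sketch of the reported `cognify_v2` signature. `KnowledgeGraph` below is a stand-in class and the body is simplified; this assumes only the signature quoted above:

```python
import asyncio

class KnowledgeGraph:
    """Stand-in for cognee's default graph model class."""

async def cognify(datasets, user, graph_model=KnowledgeGraph):
    # graph_model is optional: callers that omit it get the default class,
    # so the route only needs to forward it when the payload provides one
    return graph_model

print(asyncio.run(cognify(["main"], "user")))                        # → <class 'KnowledgeGraph'>
print(asyncio.run(cognify(["main"], "user", graph_model="custom")))  # → custom
```

This is why passing `payload.graph_model` fails: the callee tolerates a missing argument, but the DTO attribute itself does not exist.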
🔗 Analysis chain
Verify cognee_cognify signature and usage
Let's verify that the cognee_cognify function is updated to accept the new graph_model parameter and check for any other endpoints using CognifyPayloadDTO.
Let me gather more information about the cognee_cognify function implementation.
Let me check the implementation of the imported cognify function from cognify_v2.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Check cognee_cognify implementation
ast-grep --pattern 'def cognify($$$, $graph_model, $$$):'

# Check for other usages of CognifyPayloadDTO
rg -l "CognifyPayloadDTO"
```

Length of output: 137
Script:
```shell
#!/bin/bash
# Check the implementation of cognee_cognify function
rg "async def cognee_cognify" -A 5

# Check the CognifyPayloadDTO definition
rg "class CognifyPayloadDTO" -A 10
```

Length of output: 1115
Script:
```shell
#!/bin/bash
# Check the implementation in cognify_v2.py
rg "async def cognify" cognee/api/v1/cognify/cognify_v2.py -A 5

# Also check if there are any other files importing CognifyPayloadDTO
rg "from.*CognifyPayloadDTO import"
rg "from.*get_cognify_router import.*CognifyPayloadDTO"
```

Length of output: 395
```diff
 @router.post("/", response_model=None)
-async def cognify(payload: CognifyPayloadDTO, user: User = Depends(get_authenticated_user)):
+async def cognify(payload: CognifyPayloadDTO, user: User = Depends(get_authenticated_user), graph_model:BaseModel=None):
```
If graph_model is part of the CognifyPayloadDTO I don't think we need it as an input parameter in the endpoint.
fixed
```diff
 class CognifyPayloadDTO(BaseModel):
     datasets: List[str]
+    graph_model: BaseModel
```
If we want to have graph_model be optional we should add the type in an Optional wrapper
fixed
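For illustration, the `Optional` wrapper the reviewer mentions would look like the sketch below. A plain dataclass stands in for the Pydantic model so the example is dependency-free; in the real router, `CognifyPayloadDTO` subclasses `pydantic.BaseModel` and the field type would be the graph model class rather than `dict`:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CognifyPayloadDTO:
    datasets: List[str]
    # Optional[...] with a None default lets callers omit graph_model entirely
    graph_model: Optional[dict] = None

payload = CognifyPayloadDTO(datasets=["main_dataset"])
print(payload.graph_model)  # → None
```

The defaulted field must come after the required `datasets` field, mirroring the parameter-ordering rule from the endpoint signature.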
Actionable comments posted: 2
🧹 Nitpick comments (5)
.github/workflows/test_neo4j.yml (2)
18-18: Maintain consistent job property ordering across workflow files
- The `if` condition is placed before `runs-on`, while other workflow files place it after. Consider maintaining consistent ordering across all workflow files for better maintainability.
- This file also has the same issues as noted in test_qdrant.yml regarding conditional execution.
Consider updating the job definition to match the ordering in other files:
```diff
   run_neo4j_integration_test:
     name: test
-    if: ${{ github.event.label.name == 'run-checks' }}
     runs-on: ubuntu-latest
+    if: ${{ github.event_name == 'workflow_dispatch' || github.event.label.name == 'run-checks' }}
```
Line range hint 1-1: Consider documenting and standardizing the label-based workflow control

The introduction of label-based control for test execution is a good approach to optimize CI resources. Consider the following improvements:

Add documentation in the repository explaining:
- The purpose and usage of the `run-checks` label
- When and how to apply the label
- The impact on different types of workflows
Consider creating a reusable workflow for the common job configuration to reduce duplication across these similar workflow files.
Would you like me to help create a reusable workflow and update the documentation?
.github/workflows/test_milvus.yml (1)
21-21: Simplify the conditional syntax

The condition can be simplified by removing the explicit expression syntax:

```diff
-    if: ${{ github.event.label.name == 'run-checks' }}
+    if: github.event.label.name == 'run-checks'
```

.github/workflows/test_pgvector.yml (1)
Line range hint 1-21: Consider using reusable workflows to reduce duplication

Multiple workflow files share similar configuration (triggers, concurrency, label condition). Consider creating a reusable workflow to maintain these common patterns.
Example implementation:
```yaml
# .github/workflows/reusable-test.yml
name: Reusable Test Workflow
on:
  workflow_call:
    inputs:
      test-command:
        required: true
        type: string
    secrets:
      llm-api-key:
        required: true
jobs:
  run_test:
    if: contains(github.event.pull_request.labels.*.name, 'run-checks')
    runs-on: ubuntu-latest
    steps:
      # ... common setup steps ...
      - name: Run Test
        env:
          LLM_API_KEY: ${{ secrets.llm-api-key }}
        run: ${{ inputs.test-command }}
```

Usage:
```yaml
# .github/workflows/test_pgvector.yml
name: test | pgvector
on:
  workflow_dispatch:
  pull_request:
    types: [labeled, unlabeled, synchronize]
jobs:
  test:
    uses: ./.github/workflows/reusable-test.yml
    with:
      test-command: poetry run python ./cognee/tests/test_pgvector.py
    secrets:
      llm-api-key: ${{ secrets.OPENAI_API_KEY }}
```

.github/workflows/test_deduplication.yml (1)
Line range hint 1-20: Consider implementing a comprehensive CI/CD strategy

The current approach of using labels to control test execution across multiple workflows suggests a need for a more structured CI/CD strategy.
Consider these improvements:
- Test Matrix Strategy:
  - Create a central workflow that defines a test matrix
  - Use job dependencies to control test flow
  - Example:

```yaml
jobs:
  determine_test_suite:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - id: set-matrix
        run: |
          if ${{ contains(github.event.pull_request.labels.*.name, 'run-checks') }}; then
            echo "matrix={\"test\":[\"milvus\",\"pgvector\",\"deduplication\"]}" >> $GITHUB_OUTPUT
          else
            echo "matrix={\"test\":[]}" >> $GITHUB_OUTPUT
          fi
  run_tests:
    needs: determine_test_suite
    strategy:
      matrix: ${{ fromJson(needs.determine_test_suite.outputs.matrix) }}
    uses: ./.github/workflows/reusable-test.yml
    with:
      test-type: ${{ matrix.test }}
```
- Environment Protection Rules:
  - Instead of labels, use GitHub Environments with protection rules
  - This provides better audit trails and approval workflows
- Workflow Optimization:
  - Use dependency caching for faster builds
  - Implement test splitting for parallel execution
  - Add timeout policies to prevent hung jobs
Would you like me to create a detailed implementation plan for any of these suggestions?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- .github/workflows/test_deduplication.yml (1 hunks)
- .github/workflows/test_milvus.yml (1 hunks)
- .github/workflows/test_neo4j.yml (1 hunks)
- .github/workflows/test_pgvector.yml (1 hunks)
- .github/workflows/test_python_3_10.yml (1 hunks)
- .github/workflows/test_python_3_11.yml (1 hunks)
- .github/workflows/test_python_3_9.yml (1 hunks)
- .github/workflows/test_qdrant.yml (1 hunks)
- .github/workflows/test_weaviate.yml (1 hunks)
🔇 Additional comments (6)
.github/workflows/test_weaviate.yml (1)
21-21: Same issues as in test_qdrant.yml
.github/workflows/test_milvus.yml (1)
Line range hint 4-6: Verify label removal handling.
The current setup might miss running tests when:
- A PR is labeled with 'run-checks'
- Changes are pushed
- The label is removed
- More changes are pushed
The job won't run on subsequent pushes until the label is re-added.
Consider one of these alternatives:
- Add 'unlabeled' to the trigger events and adjust the condition to check if the label exists:

```diff
 on:
   workflow_dispatch:
   pull_request:
-    types: [labeled, synchronize]
+    types: [labeled, unlabeled, synchronize]
 jobs:
   run_milvus:
-    if: github.event.label.name == 'run-checks'
+    if: contains(github.event.pull_request.labels.*.name, 'run-checks')
```

- Or use GitHub Actions environments with protection rules to control when tests run.
Also applies to: 21-21
.github/workflows/test_pgvector.yml (1)
21-21: Simplify the conditional syntax.
The condition can be simplified by removing the explicit expression syntax, as mentioned in the review of test_milvus.yml.
.github/workflows/test_deduplication.yml (1)
20-20: Simplify the conditional syntax.
The condition can be simplified by removing the explicit expression syntax, as mentioned in the review of test_milvus.yml.
.github/workflows/test_python_3_10.yml (1)
21-21: Same issue as in Python 3.9 workflow
This condition has the same robustness concerns as identified in the Python 3.9 workflow file.
Please apply the same fix as suggested in the Python 3.9 workflow review.
.github/workflows/test_python_3_11.yml (1)
21-21: Same issue as in other Python workflows
This condition has the same robustness concerns as identified in the other workflow files.
Consider creating a reusable workflow to:
- Eliminate code duplication across workflow files
- Centralize the conditional logic
- Make future maintenance easier
Example approach:
- Create `.github/workflows/reusable-python-test.yml` with the test job
- Reference it from version-specific workflows using `uses` and `with` to pass the Python version
Would you like me to provide an example implementation of the reusable workflow?
```yaml
  run_qdrant_integration_test:
    name: test
    runs-on: ubuntu-latest
    if: ${{ github.event.label.name == 'run-checks' }}
```
🛠️ Refactor suggestion
Consider improving the conditional execution configuration
- The current condition might prevent tests from running on manual workflow dispatch (`workflow_dispatch` event).
- There's inconsistent spacing after `if:`.
Consider updating the condition to handle both labeled PRs and manual triggers:
```diff
-    if: ${{ github.event.label.name == 'run-checks' }}
+    if: ${{ github.event_name == 'workflow_dispatch' || github.event.label.name == 'run-checks' }}
```

📝 Committable suggestion
```yaml
    if: ${{ github.event_name == 'workflow_dispatch' || github.event.label.name == 'run-checks' }}
```
```yaml
  run_common:
    name: test
    runs-on: ubuntu-latest
    if: ${{ github.event.label.name == 'run-checks' }}
```
Improve robustness of the conditional check
The current condition might fail when accessing label.name during 'synchronize' events where no label exists. Consider using a more robust condition that handles both trigger events appropriately.
Replace with this safer condition:
```diff
-    if: ${{ github.event.label.name == 'run-checks' }}
+    if: |
+      github.event_name == 'workflow_dispatch' ||
+      (github.event_name == 'pull_request' &&
+      (github.event.action == 'labeled' && github.event.label.name == 'run-checks' ||
+      github.event.action == 'synchronize' && contains(github.event.pull_request.labels.*.name, 'run-checks')))
```

This improved condition:
- Allows manual workflow dispatch
- Properly handles both 'labeled' and 'synchronize' events
- Safely checks for the 'run-checks' label
📝 Committable suggestion
```yaml
    if: |
      github.event_name == 'workflow_dispatch' ||
      (github.event_name == 'pull_request' &&
      (github.event.action == 'labeled' && github.event.label.name == 'run-checks' ||
      github.event.action == 'synchronize' && contains(github.event.pull_request.labels.*.name, 'run-checks')))
```
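The suggested expression can be mirrored in plain Python to sanity-check its branches. The event fields are modeled as function arguments here, which is an assumption about how the workflow payload maps onto them:

```python
def should_run(event_name, action=None, label=None, pr_labels=()):
    # Mirrors the suggested workflow condition:
    # manual dispatch always runs; a 'labeled' event runs when the added
    # label is 'run-checks'; a 'synchronize' event runs when the PR
    # already carries the 'run-checks' label.
    if event_name == "workflow_dispatch":
        return True
    if event_name == "pull_request":
        if action == "labeled":
            return label == "run-checks"
        if action == "synchronize":
            return "run-checks" in pr_labels
    return False

print(should_run("workflow_dispatch"))                               # → True
print(should_run("pull_request", "labeled", "run-checks"))           # → True
print(should_run("pull_request", "synchronize", pr_labels=[]))       # → False
```

The `synchronize` branch is what the simpler `github.event.label.name` check gets wrong: that event carries no `label` field, so only `contains(...)` over the PR's label list behaves correctly there.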
Summary by CodeRabbit
New Features
- Added a new optional parameter, `graph_model`, to the cognify function, enhancing its functionality.

Workflow Improvements
- Workflow test jobs now run only when the `run-checks` label is present on the pull request.