-
Notifications
You must be signed in to change notification settings - Fork 967
Removed check_permissions_on_dataset.py and related references #1786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removed check_permissions_on_dataset.py and related references #1786
Conversation
Please make sure all the checkboxes are checked:
|
WalkthroughThis PR removes the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @martin0731, thank you for submitting a PR! We will respond as soon as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
cognee/api/v1/cognify/cognify.py (2)
244-294: Consider removing the unuseduserparameter.The removal of the permissions check task is correct. However, the
userparameter is no longer used withinget_default_tasks()but remains in the function signature. While keeping it might provide backward compatibility or future extensibility, consider removing it to clarify the function's actual dependencies.If you decide to clean up the signature, apply this diff:
async def get_default_tasks( # TODO: Find out a better way to do this (Boris's comment) - user: User = None, graph_model: BaseModel = KnowledgeGraph, chunker=TextChunker, chunk_size: int = None,And update the call site in
cognify()at line 217:tasks = await get_default_tasks( - user=user, graph_model=graph_model, chunker=chunker,
297-332: Consider removing the unuseduserparameter for consistency.The removal of the permissions check task and docstring updates are correct. Similar to
get_default_tasks(), theuserparameter is no longer used withinget_temporal_tasks(). For consistency and clarity, consider removing it from the signature as well.If you decide to clean up the signature, apply this diff:
async def get_temporal_tasks( - user: User = None, chunker=TextChunker, chunk_size: int = None, chunks_per_batch: int = 10 + chunker=TextChunker, chunk_size: int = None, chunks_per_batch: int = 10 ) -> list[Task]: """ Builds and returns a list of temporal processing tasks to be executed in sequence. The pipeline includes: 1. Document classification. 2. Document chunking with a specified or default chunk size. 3. Event and timestamp extraction from chunks. 4. Knowledge graph extraction from events. 5. Batched insertion of data points. Args: - user (User, optional): The user requesting task execution. chunker (Callable, optional): A text chunking function/class to split documents. Defaults to TextChunker.And update the call site in
cognify()at line 213:tasks = await get_temporal_tasks( - user=user, chunker=chunker, chunk_size=chunk_size, chunks_per_batch=chunks_per_batch + chunker=chunker, chunk_size=chunk_size, chunks_per_batch=chunks_per_batch )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cognee/api/v1/cognify/cognify.py(2 hunks)cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py(0 hunks)cognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.py(2 hunks)cognee/tasks/documents/__init__.py(0 hunks)cognee/tasks/documents/check_permissions_on_dataset.py(0 hunks)examples/python/simple_example.py(1 hunks)notebooks/cognee_demo.ipynb(1 hunks)
🔥 Files not summarized due to errors (1)
- notebooks/cognee_demo.ipynb: Error: Server error: no LLM provider could handle the message
💤 Files with no reviewable changes (3)
- cognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.py
- cognee/tasks/documents/check_permissions_on_dataset.py
- cognee/tasks/documents/init.py
🧰 Additional context used
📓 Path-based instructions (3)
{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py: Use 4-space indentation; name modules and functions in snake_case; name classes in PascalCase (Python)
Adhere to ruff rules, including import hygiene and configured line length (100)
Keep Python lines ≤ 100 characters
Files:
cognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.pycognee/api/v1/cognify/cognify.pyexamples/python/simple_example.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
cognee/**/*.py: Public APIs in the core library should be type-annotated where practical
Prefer explicit, structured error handling and use shared logging utilities from cognee.shared.logging_utils
Files:
cognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.pycognee/api/v1/cognify/cognify.py
examples/python/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
When adding public APIs, provide or update targeted examples under examples/python/
Files:
examples/python/simple_example.py
🧠 Learnings (2)
📚 Learning: 2024-11-13T14:55:05.912Z
Learnt from: 0xideas
Repo: topoteretes/cognee PR: 205
File: cognee/tests/unit/processing/chunks/chunk_by_paragraph_test.py:7-7
Timestamp: 2024-11-13T14:55:05.912Z
Learning: When changes are made to the chunking implementation in `cognee/tasks/chunks`, the ground truth values in the corresponding tests in `cognee/tests/unit/processing/chunks` need to be updated accordingly.
Applied to files:
cognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.py
📚 Learning: 2024-10-07T11:20:44.876Z
Learnt from: borisarzentar
Repo: topoteretes/cognee PR: 144
File: cognee/tasks/chunking/query_chunks.py:1-17
Timestamp: 2024-10-07T11:20:44.876Z
Learning: The `query_chunks` function in `cognee/tasks/chunking/query_chunks.py` is used within the `search` function in `cognee/api/v1/search/search_v2.py`.
Applied to files:
cognee/api/v1/cognify/cognify.py
🔇 Additional comments (6)
notebooks/cognee_demo.ipynb (1)
591-597: No action required—notebook changes are correct.Verification confirms that no source code references to
check_permissions_on_datasetexist in the notebook cells. The only mentions of this function appear in old execution logs (output from previous runs), which will no longer be generated once the function is removed from the source code. The change toexecution_count: nullis appropriate notebook cleanup for fresh execution.examples/python/simple_example.py (1)
35-41: LGTM! Documentation accurately reflects the updated pipeline.The process step descriptions have been correctly updated to reflect the removal of the permissions check task. The steps now accurately describe the cognify pipeline flow.
cognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.py (2)
33-34: LGTM! Base task selection correctly updated.The indices have been correctly adjusted from
[0, 1, 2]to[0, 1]to reflect the removal of the permissions check task. The comment accurately describes the selected tasks (classify and extract_chunks).
54-55: LGTM! Consistent with get_no_summary_tasks.The base task selection has been updated consistently with the changes in
get_no_summary_tasks, correctly selecting only the classify and extract_chunks tasks.cognee/api/v1/cognify/cognify.py (2)
21-24: LGTM! Import correctly removed.The removal of
check_permissions_on_datasetfrom the imports is correct and aligns with the deletion of the corresponding module.
80-85: LGTM! Docstring accurately reflects the updated pipeline.The processing pipeline description has been correctly updated to remove the permissions check step and accurately describes the current cognify workflow.
|
Hi! I’ve completed the removal of the check_permissions_on_dataset task and all related imports and usages. Thanks for reviewing! |
|
@dexters1 please review |
borisarzentar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Let's remove that one unused arg and it is perfect!
| Args: | ||
| user (User, optional): The user requesting task execution, used for permission checks. | ||
| user (User, optional): The user requesting task execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove user argument if we don't need it.
Description
This PR removes the obsolete
check_permissions_on_datasettask and all its related imports and usages across the codebase.The authorization logic is now handled earlier in the pipeline, so this task is no longer needed.
These changes simplify the default Cognify pipeline and make the code cleaner and easier to maintain.
Changes Made
cognee/tasks/documents/check_permissions_on_dataset.pycognee/tasks/documents/__init__.pycognee/api/v1/cognify/cognify.pycognee/eval_framework/corpus_builder/task_getters/get_cascade_graph_tasks.pycognee/eval_framework/corpus_builder/task_getters/get_default_tasks_by_indices.py(index positions changed)notebooks/cognee_demo.ipynbexamples/python/simple_example.py(process description)Type of Change
Pre-submission Checklist
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.