Merged
fix: run lint error
borisarzentar committed Feb 19, 2025
commit 8530862465a8c9caa3f6e3c3270f9c24e6c8fa48
7 changes: 6 additions & 1 deletion cognee/modules/pipelines/operations/run_tasks.py
@@ -269,7 +269,12 @@ async def run_tasks_with_telemetry(tasks: list[Task], data, pipeline_name: str):
raise error


-async def run_tasks(tasks: list[Task], dataset_id: UUID = uuid4(), data: Any = None, pipeline_name: str = "unknown_pipeline"):
+async def run_tasks(
+    tasks: list[Task],
+    dataset_id: UUID = uuid4(),
+    data: Any = None,
+    pipeline_name: str = "unknown_pipeline",
+):
Comment on lines +272 to +277
⚠️ Potential issue

Fix the default value for dataset_id to avoid shared UUIDs.

While adding default parameters aligns with the PR objectives and improves usability, there's a potential issue with the dataset_id default value. Python evaluates default argument values once, when the def statement runs (for a module-level function, at import time), not on each call. As a result, every call that omits dataset_id shares the same UUID for the lifetime of the process.

Apply this diff to default to None and generate the UUID inside the function body instead:

 async def run_tasks(
     tasks: list[Task],
-    dataset_id: UUID = uuid4(),
+    dataset_id: UUID | None = None,
     data: Any = None,
     pipeline_name: str = "unknown_pipeline",
 ):
+    if dataset_id is None:
+        dataset_id = uuid4()

This ensures that each call gets a unique UUID when dataset_id is not provided.
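
The difference can be checked in isolation. The following is a minimal sketch, not code from this repository: `run_shared` and `run_fresh` are hypothetical stand-ins for the two signatures discussed above, and `Optional[UUID]` is used as the equivalent of `UUID | None` so the snippet also runs on older Python versions.

```python
import asyncio
from typing import Optional
from uuid import UUID, uuid4


async def run_shared(dataset_id: UUID = uuid4()) -> UUID:
    # Hypothetical stand-in for the current signature.
    # uuid4() was called once, when this def statement executed.
    return dataset_id


async def run_fresh(dataset_id: Optional[UUID] = None) -> UUID:
    # Hypothetical stand-in for the suggested signature.
    # None is a sentinel: a new UUID is generated on each call that omits it.
    if dataset_id is None:
        dataset_id = uuid4()
    return dataset_id


async def main() -> tuple:
    # Compare the IDs produced by two calls that both omit dataset_id.
    shared_equal = await run_shared() == await run_shared()
    fresh_equal = await run_fresh() == await run_fresh()
    return shared_equal, fresh_equal


shared_equal, fresh_equal = asyncio.run(main())
print(shared_equal, fresh_equal)  # True False
```

An explicitly passed dataset_id is returned unchanged by both variants, so callers that already supply one should see no difference in behavior.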

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-async def run_tasks(
-    tasks: list[Task],
-    dataset_id: UUID = uuid4(),
-    data: Any = None,
-    pipeline_name: str = "unknown_pipeline",
-):
+async def run_tasks(
+    tasks: list[Task],
+    dataset_id: UUID | None = None,
+    data: Any = None,
+    pipeline_name: str = "unknown_pipeline",
+):
+    if dataset_id is None:
+        dataset_id = uuid4()
+    # ... (rest of the function body)

pipeline_id = uuid5(NAMESPACE_OID, pipeline_name)

pipeline_run = await log_pipeline_run_start(pipeline_id, dataset_id, data)