Merged
39c2106
Added basic profiling
Vasilije1990 Dec 4, 2024
a935940
Added basic profiling
Vasilije1990 Dec 4, 2024
7e66d50
Added basic profiling
Vasilije1990 Dec 4, 2024
df7bbfe
Added basic profiling
Vasilije1990 Dec 4, 2024
d589255
Added basic profiling
Vasilije1990 Dec 4, 2024
94688ed
Added basic profiling
Vasilije1990 Dec 4, 2024
6171cd7
Added basic profiling
Vasilije1990 Dec 4, 2024
bba32aa
Added basic profiling
Vasilije1990 Dec 4, 2024
a904b8d
Added basic profiling
Vasilije1990 Dec 4, 2024
bdef152
Added basic profiling
Vasilije1990 Dec 4, 2024
e2539cd
Added basic profiling
Vasilije1990 Dec 4, 2024
6ab427e
Added basic profiling
Vasilije1990 Dec 4, 2024
fa60827
Added basic profiling
Vasilije1990 Dec 4, 2024
32ca751
Added basic profiling
Vasilije1990 Dec 4, 2024
d523f71
Added basic profiling
Vasilije1990 Dec 4, 2024
c2896f3
Added basic profiling
Vasilije1990 Dec 4, 2024
e178d38
Added basic profiling
Vasilije1990 Dec 4, 2024
21c7b8e
Bump release version
Vasilije1990 Dec 4, 2024
54b8844
Bump release version
Vasilije1990 Dec 4, 2024
692770c
Bump release version
Vasilije1990 Dec 4, 2024
2df1eb6
Bump release version
Vasilije1990 Dec 4, 2024
8d1936f
Bump release version
Vasilije1990 Dec 4, 2024
f37d96d
Bump release version
Vasilije1990 Dec 4, 2024
cc43a8c
Bump release version
Vasilije1990 Dec 4, 2024
cf51555
Bump release version
Vasilije1990 Dec 4, 2024
d2fccc1
Bump release version
Vasilije1990 Dec 4, 2024
a96daef
Bump release version
Vasilije1990 Dec 4, 2024
316f2f3
Merge branch 'main' into COG-698
Vasilije1990 Dec 5, 2024
0268df2
Merge branch 'main' into COG-698
borisarzentar Dec 6, 2024
ea879b2
Merge branch 'main' into COG-698
borisarzentar Dec 8, 2024
b397f9e
Merge branch 'main' into COG-698
borisarzentar Dec 9, 2024
c431e7c
Update profiling.yaml
Vasilije1990 Dec 10, 2024
5609bbc
removed issues
Vasilije1990 Dec 11, 2024
0e68019
fix
Vasilije1990 Dec 11, 2024
0f0e34e
Merge branch 'main' into COG-698
Vasilije1990 Dec 11, 2024
7cc5607
fix
Vasilije1990 Dec 11, 2024
4dac950
Merge remote-tracking branch 'origin/COG-698' into COG-698
Vasilije1990 Dec 11, 2024
9a9ea75
Fix neo4j
Vasilije1990 Dec 11, 2024
129 changes: 129 additions & 0 deletions .github/workflows/profiling.yaml
@@ -0,0 +1,129 @@
name: Profiling Comparison for Specific File 2

on:
  pull_request_target:
    types:
      - opened
      - reopened
      - synchronize

⚠️ Potential issue

Critical Security Risk: Unsafe usage of pull_request_target

The current trigger configuration using pull_request_target poses a significant security risk as it:

  1. Runs with repository secrets and permissions
  2. Checks out and executes untrusted code from pull requests
  3. Lacks path filters to limit scope

Implement these security measures:

 on:
-  pull_request_target:
+  pull_request:
     types:
       - opened
       - reopened
       - synchronize
+    paths:
+      - 'cognee/api/v1/cognify/code_graph_pipeline.py'

If pull_request_target is absolutely necessary for token permissions:

  1. Never checkout or run code from the PR
  2. Use explicit commit SHA references
  3. Implement additional security validations
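The risk described above comes from combining the `pull_request_target` trigger with a checkout of the PR's head commit. A minimal Python sketch of a text-level check for that combination follows. The function name `find_risky_pattern` and the string heuristics are illustrative assumptions, not an existing linter; a real check would parse the YAML rather than scan it.

```python
import re


def find_risky_pattern(workflow_text: str) -> bool:
    """Return True if the text uses pull_request_target AND references the PR head.

    Hypothetical heuristic for illustration only: it scans raw text and does
    not understand YAML structure.
    """
    uses_target = (
        re.search(r"^\s*pull_request_target\s*:", workflow_text, re.MULTILINE)
        is not None
    )
    references_head = "github.event.pull_request.head" in workflow_text
    return uses_target and references_head


# Small sample modeled on the workflow in this PR.
sample = """\
on:
  pull_request_target:
    types:
      - opened
steps:
  - run: echo "HEAD_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV
"""

print(find_risky_pattern(sample))
print(find_risky_pattern(sample.replace("pull_request_target", "pull_request")))
```

The second call shows that the same file under the plain `pull_request` trigger is not flagged by this heuristic.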

jobs:
  profiler:
    runs-on: ubuntu-latest

    steps:
      # Checkout the code from the repository with full history
      - name: Checkout code
        uses: actions/checkout@v3
@coderabbitai coderabbitai bot Dec 4, 2024


⚠️ Potential issue

Update GitHub Actions to latest versions

Several actions are using outdated versions which may have security or compatibility issues:

  • actions/checkout@v3 → v4
  • actions/setup-python@v4 → v5
  • actions/github-script@v6 → v7
-      uses: actions/checkout@v3
+      uses: actions/checkout@v4
-      uses: actions/setup-python@v4
+      uses: actions/setup-python@v5
-      uses: actions/github-script@v6
+      uses: actions/github-script@v7

Also applies to: 18-18, 91-91

🧰 Tools
🪛 actionlint (1.7.4)

12-12: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

@borisarzentar borisarzentar Dec 6, 2024


Update to actions/checkout@v4

@borisarzentar, to update the GitHub Actions to their latest versions, you can modify the workflow file as follows:

  • Change uses: actions/checkout@v3 to uses: actions/checkout@v4
  • Change uses: actions/setup-python@v4 to uses: actions/setup-python@v5
  • Change uses: actions/github-script@v6 to uses: actions/github-script@v7

This will ensure your workflow uses the latest versions of these actions.
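The three version bumps listed above are mechanical, so they can be applied with a small script. The following Python sketch assumes the workflow is available as plain text; the helper name `bump_actions` and the `UPGRADES` table are hypothetical, with the version pairs taken from the review comment.

```python
import re

# Version pairs copied from the review comment; the table itself is illustrative.
UPGRADES = {
    "actions/checkout": ("v3", "v4"),
    "actions/setup-python": ("v4", "v5"),
    "actions/github-script": ("v6", "v7"),
}


def bump_actions(workflow_text: str) -> str:
    """Rewrite `uses: <action>@<old>` to `uses: <action>@<new>` for known actions."""
    for action, (old, new) in UPGRADES.items():
        # The negative lookahead avoids touching longer refs such as v3.1 or v3abc.
        pattern = rf"(uses:\s*{re.escape(action)})@{re.escape(old)}(?![\w.])"
        workflow_text = re.sub(pattern, rf"\1@{new}", workflow_text)
    return workflow_text


before = (
    "uses: actions/checkout@v3\n"
    "uses: actions/setup-python@v4\n"
    "uses: snok/install-poetry@v1.3.2\n"
)
print(bump_actions(before))
```

Actions that are not in the table, such as `snok/install-poetry@v1.3.2`, are left unchanged.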

        with:
          fetch-depth: 0 # Fetch all history so we can checkout any commit

      - name: Check if the sender is a maintainer
        id: check_permissions
        uses: actions/github-script@v6
        with:
          script: |
            const sender = context.payload.sender.login;
            const { data: membership } = await github.rest.orgs.getMembershipForUser({
              org: context.repo.owner,
              username: sender,
            }).catch(() => ({ data: { role: null } }));
            return membership.role;
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Set is_maintainer variable
        run: |
          echo "is_maintainer=${{ steps.check_permissions.outputs.result == 'admin' || steps.check_permissions.outputs.result == 'maintainer' }}" >> $GITHUB_ENV

      - name: Stop if not a maintainer
        if: env.is_maintainer != 'true'
        run: |
          echo "User ${{ github.event.sender.login }} is not a maintainer. Exiting."
          exit 0 # Use exit 0 to mark the job as successful but stop execution
Comment on lines +30 to +38

🛠️ Refactor suggestion

Improve shell script security and error handling

The maintainer check implementation has potential security issues:

 - name: Set is_maintainer variable
   run: |
-    echo "is_maintainer=${{ steps.check_permissions.outputs.result == 'admin' || steps.check_permissions.outputs.result == 'maintainer' }}" >> $GITHUB_ENV
+    is_maintainer="${{ steps.check_permissions.outputs.result == 'admin' || steps.check_permissions.outputs.result == 'maintainer' }}"
+    echo "is_maintainer=${is_maintainer}" >> "$GITHUB_ENV"

 - name: Stop if not a maintainer
   if: env.is_maintainer != 'true'
   run: |
+    echo "::warning::Profiling skipped - User ${{ github.event.sender.login }} is not a maintainer"
     echo "User ${{ github.event.sender.login }} is not a maintainer. Exiting."
     exit 0
🧰 Tools
🪛 actionlint (1.7.4)

35-35: shellcheck reported issue in this script: SC2086:info:1:140: Double quote to prevent globbing and word splitting

(shellcheck)
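One further detail of the maintainer check may be worth verifying: unless `result-encoding: string` is set, `actions/github-script` stores the script's return value JSON-encoded, so the step output would be `"admin"` with the quotes included, and a comparison against `'admin'` would not match. The Python sketch below illustrates that mismatch and a decode-before-compare approach. The helper `is_maintainer` is hypothetical, and the role names are simply the two the workflow compares against.

```python
import json

# Role names copied from the workflow's comparison; not a statement about the API.
MAINTAINER_ROLES = {"admin", "maintainer"}


def is_maintainer(raw_result: str) -> bool:
    """Decode a possibly JSON-encoded step result before comparing role names."""
    try:
        role = json.loads(raw_result)
    except json.JSONDecodeError:
        role = raw_result  # already a plain string
    return isinstance(role, str) and role in MAINTAINER_ROLES


raw = json.dumps("admin")      # the JSON-encoded form: '"admin"'
print(raw == "admin")          # naive comparison, prints False
print(is_maintainer(raw))      # decoded comparison, prints True
print(is_maintainer("null"))   # the catch() fallback role, prints False
```

Separately, `exit 0` ends only the step it runs in; later steps still execute unless they carry their own `if:` condition.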


      # Set up Python environment
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install Poetry
        uses: snok/install-poetry@v1.3.2
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

      - name: Install dependencies
        run: |
          poetry install --no-interaction --all-extras
          poetry run pip install pyinstrument
Comment on lines +46 to +56

🛠️ Refactor suggestion

Improve dependency management security and efficiency

The current setup could be more secure and efficient:

 - name: Install Poetry
   uses: snok/install-poetry@v1.3.2
   with:
     virtualenvs-create: true
     virtualenvs-in-project: true
     installer-parallel: true
+    version: 1.7.1  # Pin to specific version

 - name: Install dependencies
   run: |
+    # Verify poetry.lock is in sync
+    poetry lock --check
-    poetry install --no-interaction --all-extras
+    # Install only needed dependencies
+    poetry install --no-interaction --only main,dev
     poetry run pip install pyinstrument



      # Set environment variables for SHAs
      - name: Set environment variables
        run: |
          echo "BASE_SHA=${{ github.event.pull_request.base.sha }}" >> $GITHUB_ENV
          echo "HEAD_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV

Comment on lines +60 to +64

⚠️ Potential issue

Improve environment variables setup

The current setup lacks validation and has potential shell scripting issues.

     - name: Set environment variables
       run: |
+        if [[ -z "${{ github.event.pull_request }}" ]]; then
+          echo "Error: No pull request context found"
+          exit 1
+        fi
-        echo "BASE_SHA=${{ github.event.pull_request.base.sha }}" >> $GITHUB_ENV
-        echo "HEAD_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV
+        echo "BASE_SHA=${{ github.event.pull_request.base.sha }}" >> "$GITHUB_ENV"
+        echo "HEAD_SHA=${{ github.event.pull_request.head.sha }}" >> "$GITHUB_ENV"
🧰 Tools
🪛 actionlint (1.7.4)

37-37: shellcheck reported issue in this script: SC2086:info:1:62: Double quote to prevent globbing and word splitting

(shellcheck)


37-37: shellcheck reported issue in this script: SC2086:info:2:62: Double quote to prevent globbing and word splitting

(shellcheck)
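The validation suggested above could also check that each value looks like a commit SHA before it is appended to the environment file as a `NAME=value` line. This is a minimal Python sketch under that assumption; `append_sha` is a hypothetical helper, and the demo writes to a temporary file rather than the real `$GITHUB_ENV`.

```python
import os
import re
import tempfile

# A full Git SHA-1 is 40 lowercase hexadecimal characters.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def append_sha(env_file: str, name: str, sha: str) -> None:
    """Append NAME=sha to env_file, rejecting anything that is not a 40-char hex SHA."""
    if not SHA_RE.fullmatch(sha):
        raise ValueError(f"{name} is not a 40-character hex SHA: {sha!r}")
    with open(env_file, "a", encoding="utf-8") as f:
        f.write(f"{name}={sha}\n")


# Demo against a temporary file that stands in for $GITHUB_ENV.
demo_file = os.path.join(tempfile.mkdtemp(), "github_env")
append_sha(demo_file, "BASE_SHA", "0" * 40)
append_sha(demo_file, "HEAD_SHA", "f" * 40)

with open(demo_file, encoding="utf-8") as f:
    print(f.read(), end="")
```

Rejecting malformed values up front means a missing pull request context fails loudly instead of producing an empty `BASE_SHA=` line.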

      # Run profiler on the base branch
      - name: Run profiler on base branch
        env:
          BASE_SHA: ${{ env.BASE_SHA }}
        run: |
          echo "Profiling the base branch for code_graph_pipeline.py"
          echo "Checking out base SHA: $BASE_SHA"
          git checkout $BASE_SHA
          echo "This is the working directory: $PWD"
          # Ensure the script is executable
          chmod +x cognee/api/v1/cognify/code_graph_pipeline.py
          # Run pyinstrument
          poetry run pyinstrument --renderer json -o base_results.json cognee/api/v1/cognify/code_graph_pipeline.py

      # Run profiler on head branch
      - name: Run profiler on head branch
        env:
          HEAD_SHA: ${{ env.HEAD_SHA }}
        run: |
          echo "Profiling the head branch for code_graph_pipeline.py"
          echo "Checking out head SHA: $HEAD_SHA"
          git checkout $HEAD_SHA
          echo "This is the working directory: $PWD"
          # Ensure the script is executable
          chmod +x cognee/api/v1/cognify/code_graph_pipeline.py
          # Run pyinstrument
          poetry run pyinstrument --renderer json -o head_results.json cognee/api/v1/cognify/code_graph_pipeline.py

      # Compare profiling results
      - name: Compare profiling results
        run: |
          python -c '
          import json
          try:
              with open("base_results.json") as f:
                  base = json.load(f)
              with open("head_results.json") as f:
                  head = json.load(f)
              cpu_diff = head.get("total_cpu_samples_python", 0) - base.get("total_cpu_samples_python", 0)
              memory_diff = head.get("malloc_samples", 0) - base.get("malloc_samples", 0)
              with open("profiling_diff.txt", "w") as f:
                  f.write(f"CPU Usage Difference: {cpu_diff}\\n")
                  f.write(f"Memory Usage Difference: {memory_diff} bytes\\n")
          except Exception as e:
              with open("profiling_diff.txt", "w") as f:
                  f.write(f"Error comparing profiling results: {e}\\n")
          '

      # Post results to the pull request

We can upload this diff result as an artifact: https://github.com/actions/upload-artifact

      - name: Post profiling results to PR
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            const diff = fs.readFileSync('profiling_diff.txt', 'utf-8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `### Profiling Results for code_graph_pipeline.py\n\`\`\`\n${diff || 'No differences found.'}\n\`\`\``
            });
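A note on the comparison step in this workflow: the profiler it runs is pyinstrument, while the keys it reads (`total_cpu_samples_python`, `malloc_samples`) appear to be Scalene field names, so both `.get(..., 0)` lookups would likely fall back to 0 and the reported differences would always be 0. The Python sketch below shows one way the comparison could be written against pyinstrument's JSON output. It assumes the top-level `duration` and `cpu_time` fields (both in seconds) are present, and the function names `load_metrics` and `format_diff` are hypothetical. No memory figure is included, on the assumption that pyinstrument does not record one.

```python
import json
from pathlib import Path

# Assumed top-level keys in pyinstrument's JSON renderer output.
METRIC_KEYS = ("duration", "cpu_time")


def load_metrics(path: str) -> dict:
    """Read a profiler JSON file and keep only the timing fields (missing keys become 0.0)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: float(data.get(key, 0.0)) for key in METRIC_KEYS}


def format_diff(base: dict, head: dict) -> str:
    """Render one line per metric with base, head, and signed difference in seconds."""
    lines = []
    for key in METRIC_KEYS:
        delta = head[key] - base[key]
        lines.append(
            f"{key}: base={base[key]:.3f}s head={head[key]:.3f}s diff={delta:+.3f}s"
        )
    return "\n".join(lines) + "\n"


# Demo with made-up numbers, not measured results.
base = {"duration": 1.0, "cpu_time": 0.5}
head = {"duration": 1.5, "cpu_time": 0.25}
print(format_diff(base, head), end="")
```

In the workflow, `load_metrics("base_results.json")` and `load_metrics("head_results.json")` would supply the two dictionaries, and the formatted text would be written to `profiling_diff.txt`.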