@erichare erichare commented Aug 20, 2025

This pull request updates the knowledge base API and related components to support user-specific knowledge bases and authentication using Langflow API keys. The changes ensure that knowledge bases are isolated per user, and access or modification operations require user authentication. The ingestion and retrieval components are refactored to work asynchronously and use the current user's context for all operations.

User Authentication and Knowledge Base Isolation

  • All API routes in knowledge_bases.py now require a CurrentActiveUser parameter and operate only on knowledge bases inside the authenticated user's directory.
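As a rough illustration of the scoping described above, here is a minimal sketch of resolving a knowledge base path under a per-user directory. The function names (`user_kb_dir`, `resolve_kb_path`) and the `/tmp/kb` root are hypothetical and do not come from the PR. The path-traversal check is an added assumption about what a per-user layout would likely need, not something the PR description states.

```python
from pathlib import Path


def user_kb_dir(kb_root: Path, username: str) -> Path:
    """Directory that holds one user's knowledge bases (hypothetical helper)."""
    return kb_root / username


def resolve_kb_path(kb_root: Path, username: str, kb_name: str) -> Path:
    """Resolve kb_name inside the user's directory.

    Assumption: names that escape the user's directory (for example via "..")
    are rejected. The PR text does not say how this case is handled.
    """
    base = user_kb_dir(kb_root, username).resolve()
    candidate = (base / kb_name).resolve()
    if base not in candidate.parents:
        raise ValueError(f"Invalid knowledge base name: {kb_name!r}")
    return candidate
```

With this layout, `resolve_kb_path(Path("/tmp/kb"), "alice", "docs")` points at `alice/docs` under the root, while a name such as `../bob/docs` raises `ValueError` instead of reaching another user's directory.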

Component Refactoring for Async and User Context

  • Refactored the kb_ingest.py and kb_retrieval.py components to run knowledge base operations asynchronously. Each component resolves the current user from the supplied Langflow API key, or creates a new user token when no key is supplied.
  • All knowledge base directories and operations now use the authenticated user's username as a directory prefix, ensuring user-level isolation.
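The user-resolution step might look roughly like the sketch below. Everything here is illustrative: `User`, `_API_KEYS`, `lookup_user_by_api_key`, and `get_current_user` are hypothetical stand-ins, the key table is an in-memory dict rather than a database, and the fallback simply returns a caller-supplied user instead of creating a token as the PR describes.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    username: str


# Hypothetical in-memory stand-in for the API key table.
_API_KEYS = {"sk-alice": User("alice")}


async def lookup_user_by_api_key(api_key: str) -> Optional[User]:
    """Pretend database lookup; yields to the event loop once."""
    await asyncio.sleep(0)
    return _API_KEYS.get(api_key)


async def get_current_user(api_key: Optional[str], fallback: Optional[User]) -> User:
    """Prefer the user behind the API key, else fall back to the given user."""
    if api_key:
        user = await lookup_user_by_api_key(api_key)
        if user is None:
            raise ValueError("Invalid Langflow API key")
        return user
    if fallback is None:
        raise ValueError("No API key supplied and no active user available")
    return fallback
```

The returned `username` would then serve as the directory prefix, so an invalid key fails before any knowledge base path is touched.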

API Key Handling and Variable Support

  • Added support for passing a Langflow API key to both the ingestion and retrieval components, and for resolving API keys through the variable service when needed.
  • Changed the load_from_db default for provider API keys to False for security reasons.

Dynamic Option Refresh and Error Handling

  • Knowledge base options in component dropdowns are now refreshed dynamically and retrieved asynchronously per user.
  • Improved error handling to give clear feedback when a user or API key is invalid or missing.

Internal Codebase Improvements

  • Refactored multiple methods to be asynchronous, updated imports, and improved directory creation logic to support user isolation and robust error handling.

These changes together ensure that knowledge base data is securely separated by user, and that all access and modification is properly authenticated and tracked.

Summary by CodeRabbit

  • New Features

    • Per-user knowledge bases: listing, retrieval, and deletion now show only your own KBs.
    • Added “Langflow API Key” input to ingestion and retrieval components for authentication.
    • Dynamic, per-user KB options in the UI.
    • API keys for embeddings are now encrypted in KB metadata.
    • Clearer errors for missing/invalid users, KB names, or embedding metadata.
  • Refactor

    • Ingestion and retrieval flows updated to asynchronous operations for smoother, more responsive KB creation and access.

@github-actions github-actions bot added the enhancement New feature or request label Aug 20, 2025
coderabbitai bot commented Aug 20, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

User-scoped knowledge base paths were introduced across API, components, and starter templates. Ingestion and retrieval components were converted to async, added per-user resolution via Langflow API keys/DB, and updated metadata handling with encryption. Tests and templates were updated to reflect async flows, per-user directories, and new inputs.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API: user-scoped KBs (src/backend/base/langflow/api/v1/knowledge_bases.py) | All endpoints accept CurrentActiveUser and resolve KB paths under kb_root_path/. List/get/delete (single and bulk) operate within the per-user directory; 404s when not found. |
| Component: KB Ingestion (async, per-user, key handling) (src/backend/base/langflow/components/data/kb_ingest.py) | Converted to async (build_kb_info, helpers). Added per-user KB pathing, current-user resolution, variable-based API key retrieval, encryption/decryption of keys, async update_build_config, and vector store creation under user paths. Added langflow_api_key input; changed API key load_from_db=False. |
| Component: KB Retrieval (async, per-user) (src/backend/base/langflow/components/data/kb_retrieval.py) | Async methods (_get_current_user, _get_knowledge_bases, update_build_config, get_chroma_kb_data). Per-user KB discovery and kb_path construction. Metadata decryption, explicit errors for missing embedding metadata. Added langflow_api_key input handling. |
| Starter template: Knowledge Ingestion (src/backend/base/langflow/initial_setup/starter_projects/Knowledge Ingestion.json) | Replaced component code with async, per-user, encrypted key logic; added langflow_api_key field; adjusted API key sourcing and metadata handling; updated code hash. |
| Starter template: Knowledge Retrieval (src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json) | Updated to async per-user retrieval flow, dynamic KB options, added langflow_api_key field, and updated code hash. |
| Tests: KB Ingestion (src/backend/tests/unit/components/data/test_kb_ingest.py) | Converted tests to async; updated path setup to nested langflow/; awaited new async methods; maintained mocks/patches accordingly. |
| Tests: KB Retrieval (src/backend/tests/unit/components/data/test_kb_retrieval.py) | Converted tests to async; updated paths to langflow/; adjusted metadata and retrieval tests for async behavior and new layout. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant API as API (KB v1)
  participant Auth as Auth/DB
  participant FS as KB Storage (per-user)
  Note over API,FS: New per-user KB scoping

  U->>API: List/Get/Delete KBs
  API->>Auth: Resolve CurrentActiveUser
  Auth-->>API: username
  API->>FS: Access kb_root/username/[kb_name]
  alt Found
    FS-->>API: KB entries / OK
    API-->>U: 200 (scoped result)
  else Not found
    API-->>U: 404
  end
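
The request flow in the diagram above can be sketched with plain filesystem helpers. This is an illustration only: `list_kbs`, `delete_kb`, `KBNotFoundError`, and `demo` are hypothetical names, the exception stands in for the 404 branch, and the real endpoints are not reproduced here.

```python
import shutil
import tempfile
from pathlib import Path


class KBNotFoundError(Exception):
    """Stand-in for the 404 branch in the diagram."""


def list_kbs(kb_root: Path, username: str) -> list[str]:
    """List only the knowledge bases under kb_root/username."""
    user_dir = kb_root / username
    if not user_dir.is_dir():
        return []
    return sorted(p.name for p in user_dir.iterdir() if p.is_dir())


def delete_kb(kb_root: Path, username: str, kb_name: str) -> None:
    """Delete a knowledge base, but only inside the caller's own directory."""
    kb_path = kb_root / username / kb_name
    if not kb_path.is_dir():
        raise KBNotFoundError(kb_name)
    shutil.rmtree(kb_path)


def demo() -> tuple[list[str], list[str], bool]:
    """Two users, one KB each; alice tries to delete bob's KB by name."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "alice" / "docs").mkdir(parents=True)
        (root / "bob" / "notes").mkdir(parents=True)
        alice_kbs = list_kbs(root, "alice")
        try:
            delete_kb(root, "alice", "notes")  # "notes" belongs to bob
            not_found = False
        except KBNotFoundError:
            not_found = True
        return alice_kbs, list_kbs(root, "bob"), not_found
```

In this sketch a name that exists only in another user's directory is indistinguishable from a name that does not exist at all, which matches the single "Not found" branch in the diagram.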
sequenceDiagram
  autonumber
  participant UI as Ingestion/ Retrieval Component
  participant DB as DB Session
  participant Var as Variable Service
  participant Enc as Settings (Encrypt/Decrypt)
  participant FS as KB Storage (kb_root/username)
  Note over UI,FS: Async per-user flow with API key handling

  UI->>DB: _get_current_user()
  DB-->>UI: username or error
  UI->>Var: Resolve embedding API key (optional)
  Var-->>UI: api_key or None
  UI->>Enc: Encrypt (ingest) / Decrypt (retrieve)
  Enc-->>UI: encrypted/decrypted key
  UI->>FS: Write/Read KB under kb_root/username
  FS-->>UI: Vector store / Documents
  UI-->>UI: Build/Query embeddings
  UI-->>UI: Return Data/DataFrame

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Suggested labels

feature, refactor, size:XXL, lgtm

Suggested reviewers

  • edwinjosechittilappilly
  • carlosrcoelho
  • ogabrielluiz
coderabbitai[bot]

This comment was marked as outdated.

@github-actions github-actions bot removed the enhancement New feature or request label Aug 20, 2025
@erichare erichare enabled auto-merge August 21, 2025 22:17
codecov bot commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 54.65839% with 73 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.96%. Comparing base (2475e5a) to head (f15a9b1).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...backend/base/langflow/components/data/kb_ingest.py | 63.15% | 28 Missing ⚠️ |
| ...d/base/langflow/components/agents/mcp_component.py | 9.09% | 20 Missing ⚠️ |
| ...rc/backend/base/langflow/api/v1/knowledge_bases.py | 31.25% | 11 Missing ⚠️ |
| src/backend/base/langflow/base/data/kb_utils.py | 71.42% | 6 Missing ⚠️ |
| ...kend/base/langflow/components/data/kb_retrieval.py | 78.94% | 4 Missing ⚠️ |
| ...d/base/langflow/components/processing/save_file.py | 33.33% | 4 Missing ⚠️ |

❌ Your project status has failed because the head coverage (3.79%) is below the target coverage (10.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #9458      +/-   ##
==========================================
+ Coverage   33.69%   33.96%   +0.26%     
==========================================
  Files        1219     1195      -24     
  Lines       57613    55823    -1790     
  Branches     5370     5370              
==========================================
- Hits        19411    18958     -453     
+ Misses      38132    36795    -1337     
  Partials       70       70              
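
The percentages in the diff above can be reproduced from the raw counts, assuming coverage is computed as hits divided by hits plus misses plus partials. That formula is an assumption here, though it is consistent with the Lines totals in the report. The helper name `coverage_pct` is hypothetical.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage of all tracked lines (assumed formula)."""
    total = hits + misses + partials
    return 100.0 * hits / total


# Counts copied from the coverage diff: main (base) and #9458 (head).
base = coverage_pct(hits=19411, misses=38132, partials=70)
head = coverage_pct(hits=18958, misses=36795, partials=70)

print(f"base: {base:.2f}%  head: {head:.2f}%")
```

The line totals also check out: 19411 + 38132 + 70 = 57613 for the base and 18958 + 36795 + 70 = 55823 for the head.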
| Flag | Coverage Δ |
| --- | --- |
| backend | 56.83% <54.65%> (+1.68%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/backend/base/langflow/api/v2/files.py | 42.62% <100.00%> (ø) |
| ...kend/base/langflow/components/data/kb_retrieval.py | 74.59% <78.94%> (-1.94%) ⬇️ |
| ...d/base/langflow/components/processing/save_file.py | 25.00% <33.33%> (-0.38%) ⬇️ |
| src/backend/base/langflow/base/data/kb_utils.py | 90.76% <71.42%> (-9.24%) ⬇️ |
| ...rc/backend/base/langflow/api/v1/knowledge_bases.py | 17.37% <31.25%> (-0.09%) ⬇️ |
| ...d/base/langflow/components/agents/mcp_component.py | 18.67% <9.09%> (-0.25%) ⬇️ |
| ...backend/base/langflow/components/data/kb_ingest.py | 75.52% <63.15%> (-5.00%) ⬇️ |

... and 31 files with indirect coverage changes

@erichare erichare added this pull request to the merge queue Aug 21, 2025
Merged via the queue into main with commit 59937ee Aug 21, 2025
131 of 134 checks passed
@erichare erichare deleted the feat-kb-enhancements branch August 21, 2025 23:45
lucaseduoli pushed a commit that referenced this pull request Aug 22, 2025
* feat: Make knowledge bases user-stored

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Fix ruff error

* [autofix.ci] apply automated fixes

* Reuse code

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Don't show options by default

* [autofix.ci] apply automated fixes

* Pass in the Langflow API key if set

* [autofix.ci] apply automated fixes

* Update files.py

* [autofix.ci] apply automated fixes

* Properly handle secret retrieval

* [autofix.ci] apply automated fixes

* Update src/backend/base/langflow/base/data/kb_utils.py

Co-authored-by: Gabriel Luiz Freitas Almeida <[email protected]>

* Update src/backend/base/langflow/base/data/kb_utils.py

Co-authored-by: Gabriel Luiz Freitas Almeida <[email protected]>

* Update src/backend/base/langflow/components/data/kb_ingest.py

Co-authored-by: Gabriel Luiz Freitas Almeida <[email protected]>

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Feedback from review

* [autofix.ci] apply automated fixes

* Fix other uses of incorrect user

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Feedback from review 2

* [autofix.ci] apply automated fixes

* Update kb_ingest.py

* [autofix.ci] apply automated fixes

* Update tests

* [autofix.ci] apply automated fixes

* Update kb_ingest.py

* [autofix.ci] apply automated fixes

* Fix mypy issues

* [autofix.ci] apply automated fixes

* Update kb_utils.py

* Update test_kb_ingest.py

* Fix tests

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Gabriel Luiz Freitas Almeida <[email protected]>
lucaseduoli pushed a commit that referenced this pull request Aug 25, 2025
@MustafaDUT

Super, this is what I was waiting for. Now, the files I send with session_id will be stored specifically for that session, right? Not in a global pool, but in that session’s own pool? @erichare

Labels

  • enhancement: New feature or request
  • lgtm: This PR has been approved by a maintainer
