feat: adds Component Inputs telemetry #10254

ogabrielluiz · 2025-10-13T20:05:16Z

Summary by CodeRabbit

New Features
- Enhanced telemetry with per-component run IDs and input logging, including automatic chunking for large payloads.
- Added opt-in input tracking with safeguards that exclude sensitive fields by default.
Chores
- Updated starter project dependencies (e.g., FastAPI 0.119.0, LangChain Core 0.3.79, plus select ecosystem bumps like boto3 and scrapegraph_py) for improved compatibility.
Tests
- Added integration and unit tests covering telemetry input tracking, payload splitting, and serialization to ensure reliability.

Added a new set of field types that should not be tracked in telemetry due to their sensitive nature, including PASSWORD, AUTH, FILE, CONNECTION, and MCP. Updated relevant input classes to ensure telemetry tracking is disabled for these sensitive fields, enhancing data privacy and security.
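A minimal sketch of how such a guard might look. The member names follow the field types listed above, but the enum's string values and the helper name `should_track_input` are assumptions made for illustration, not code taken from this PR.

```python
from enum import Enum


class FieldTypes(str, Enum):
    # String values are placeholders; only the member names come from the PR text.
    TEXT = "str"
    INTEGER = "int"
    PASSWORD = "password"
    AUTH = "auth"
    FILE = "file"
    CONNECTION = "connect"
    MCP = "mcp"


# Field types whose values should never be sent to telemetry.
SENSITIVE_FIELD_TYPES = {
    FieldTypes.PASSWORD,
    FieldTypes.AUTH,
    FieldTypes.FILE,
    FieldTypes.CONNECTION,
    FieldTypes.MCP,
}


def should_track_input(field_type: FieldTypes, track_in_telemetry: bool) -> bool:
    """Return True only for non-sensitive inputs that opted in to tracking."""
    if field_type in SENSITIVE_FIELD_TYPES:
        # Sensitive types are excluded even if the flag was set by mistake.
        return False
    return track_in_telemetry
```

Checking the sensitive set before the flag means a misconfigured `track_in_telemetry=True` on a password field still cannot leak the value.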

…tion support

Added new fields to the ComponentPayload and ComponentInputsPayload classes, including component_id and component_run_id, to improve telemetry data tracking. Introduced a serialize_input_values function to handle JSON serialization of component input values, ensuring robust handling of input data for telemetry purposes.
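For illustration, a JSON serializer for input values could be sketched roughly as follows. Only the function name comes from the commit message; the signature, the `default=str` fallback, and the empty-object result on failure are assumptions, and the PR's actual behavior may differ.

```python
import json
from typing import Any


def serialize_input_values(values: dict[str, Any]) -> str:
    """Serialize input values to JSON, never raising on awkward types."""
    try:
        # default=str converts values json cannot encode (sets, custom objects).
        return json.dumps(values, default=str)
    except (TypeError, ValueError):
        # Unsupported keys or circular references: report nothing rather than fail.
        return "{}"
```

Telemetry should not be able to break a component run, which is why this sketch swallows serialization errors instead of propagating them.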

Added functionality to track and cache telemetry input values within the Component class. Introduced a method to determine if inputs should be tracked based on sensitivity and an accessor for retrieving cached telemetry data, enhancing the robustness of telemetry handling.

Introduced a new method, log_package_component_inputs, to the TelemetryService for logging telemetry data related to component inputs. This enhancement improves the tracking capabilities of the telemetry system, allowing for more detailed insights into component interactions.

Added functionality to log component input telemetry both during successful execution and error cases. Introduced a unique component_run_id for each execution to improve tracking. This update ensures comprehensive telemetry data collection, enhancing the robustness of the telemetry system.

Added tests for the new component_id and component_run_id fields in ComponentPayload and ComponentInputsPayload classes. Introduced a new test suite for ComponentInputTelemetry, covering serialization of various data types and handling of edge cases. This update improves the robustness and coverage of telemetry data handling in the system.

Changed the default value of track_in_telemetry from True to False in the BaseInputMixin class. Updated documentation to clarify that telemetry tracking is now opt-in and can be explicitly enabled for individual input types, enhancing data privacy and control.

Modified the default value of `track_in_telemetry` for various input classes to enhance data privacy. Regular inputs now default to False, while safe inputs like `IntInput` and `BoolInput` default to True, ensuring explicit opt-in for telemetry tracking. Updated related tests to reflect these changes.

This commit adds two new optional fields to ComponentInputsPayload:
- chunk_index: Index of this chunk in a split payload sequence
- total_chunks: Total number of chunks in the split sequence

Both fields default to None and use camelCase aliases for serialization. This is Task 1 of the telemetry query parameter splitting implementation.

Tests included:
- Verify fields exist and can be set
- Verify camelCase serialization aliases work correctly
- Verify fields default to None when not provided

Generated with Claude Code (https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
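The chunk metadata described in this commit can be illustrated with a small standard-library sketch. It uses a plain dataclass rather than the schema's real model class; the class name `ChunkInfo`, the alias spellings `chunkIndex` and `totalChunks`, and the choice to omit unset fields are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkInfo:
    # Both fields default to None, meaning "this payload was not split".
    chunk_index: Optional[int] = None
    total_chunks: Optional[int] = None

    def to_params(self) -> dict:
        """Serialize with camelCase keys, omitting fields left as None."""
        aliases = {"chunk_index": "chunkIndex", "total_chunks": "totalChunks"}
        return {
            aliases[name]: value
            for name, value in vars(self).items()
            if value is not None
        }
```

For example, `ChunkInfo(chunk_index=0, total_chunks=3).to_params()` yields `{"chunkIndex": 0, "totalChunks": 3}`, while an unsplit payload contributes no chunk parameters at all.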

…g of oversized inputs

This commit enhances the ComponentInputsPayload class by implementing functionality to automatically split input values into multiple chunks if they exceed the maximum URL size limit.

Key changes include:
- Added methods for calculating URL size, truncating oversized values, and splitting payloads.
- Updated component_inputs field to accept a dictionary instead of a string for better handling of input values.
- Improved documentation for the ComponentInputsPayload class to reflect the new splitting behavior and usage examples.

These changes aim to improve telemetry data handling and ensure compliance with URL length restrictions.
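The splitting idea can be sketched with the standard library alone. This is a simplified greedy version under stated assumptions: the base URL, the `componentInputs` parameter name, and both function names are hypothetical, the 2000 limit is taken from the value the review mentions in the tests, and the sketch neither truncates a single oversized value nor budgets for chunk metadata as the PR's implementation is described as doing.

```python
import json
from urllib.parse import urlencode

MAX_URL_SIZE = 2000  # limit value mentioned in the review; treated as an assumption here
BASE_URL = "https://telemetry.example.invalid/component_inputs"  # placeholder


def url_size(fixed: dict, inputs: dict) -> int:
    """Length of the GET URL once all parameters are percent-encoded."""
    params = {**fixed, "componentInputs": json.dumps(inputs)}
    return len(f"{BASE_URL}?{urlencode(params)}")


def split_inputs(fixed: dict, inputs: dict, max_size: int = MAX_URL_SIZE) -> list[dict]:
    """Greedily pack input fields into chunks whose URLs fit within max_size."""
    if url_size(fixed, inputs) <= max_size:
        return [dict(inputs)]
    chunks: list[dict] = []
    current: dict = {}
    for key, value in inputs.items():
        candidate = {**current, key: value}
        if current and url_size(fixed, candidate) > max_size:
            # Adding this field would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Measuring the encoded URL rather than the raw JSON matters because percent-encoding expands quotes, braces, and non-ASCII characters to several bytes each.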

…yloads

This commit updates the log_package_component_inputs method in the TelemetryService class to split component input payloads into multiple requests if they exceed the maximum URL size limit.

Key changes include:
- Added logic to split the payload using the new split_if_needed method.
- Each chunk is queued separately for telemetry logging.

These improvements ensure better handling of telemetry data while adhering to URL length restrictions.

This commit introduces a centralized constant, MAX_TELEMETRY_URL_SIZE, to define the maximum URL length for telemetry GET requests.

Key changes include:
- Added MAX_TELEMETRY_URL_SIZE constant to schema.py for better maintainability.
- Updated split_if_needed method in ComponentInputsPayload to use the new constant instead of a hardcoded value.
- Adjusted the TelemetryService to reference the centralized constant for URL size limits.

These changes enhance code clarity and ensure consistent handling of URL size limits across the telemetry service.

This commit modifies the tests for ComponentInputsPayload to utilize a dictionary for component inputs instead of a serialized JSON string.

Key changes include:
- Renamed the test method to reflect the new input type.
- Removed unnecessary serialization steps and assertions related to JSON strings.
- Added assertions to verify the correct handling of dictionary inputs.

These changes streamline the testing process and improve clarity in how component inputs are represented.

This commit introduces integration tests for the TelemetryService to verify its handling of large and small payloads.

Key changes include:
- Added tests to ensure large payloads are split into multiple chunks and queued correctly.
- Implemented a test to confirm that small payloads are not split and result in a single queued event.
- Created a mock settings service for testing purposes.

These tests enhance the reliability of the telemetry service by ensuring proper payload management.

This commit expands the test suite for ComponentInputsPayload by adding various scenarios to ensure robust handling of input payloads.

Key changes include:
- Introduced tests for calculating URL size, ensuring it returns a positive integer and accounts for encoding.
- Added tests to verify the splitting logic for large payloads, including checks for chunk metadata and preservation of fixed fields.
- Implemented property-based tests using Hypothesis to validate that all chunks respect the maximum URL size and preserve original data.

These enhancements improve the reliability and coverage of the ComponentInputsPayload tests, ensuring proper functionality under various conditions.

coderabbitai · 2025-10-13T20:06:03Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds per-component telemetry inputs collection, splitting, and dispatch. Extends telemetry schema with ComponentInputsPayload and new fields on ComponentPayload; updates service to queue chunked input events; integrates into component build flow. Introduces telemetry opt-in flags on LFX inputs and caches serialized inputs in custom components. Updates multiple starter project dependency versions. Adjusts tests and secrets baseline.

Changes

Cohort / File(s)	Summary
Telemetry schema and service `src/backend/base/langflow/services/telemetry/schema.py`, `src/backend/base/langflow/services/telemetry/service.py`	Adds MAX_TELEMETRY_URL_SIZE, extends ComponentPayload with component_id/component_run_id, introduces ComponentInputsPayload with URL-size-aware splitting and truncation, and adds TelemetryService.log_package_component_inputs to queue chunked "component_inputs" events.
Build flow integration `src/backend/base/langflow/api/build.py`	Generates component_run_id per component, emits ComponentInputsPayload (when inputs available) and ComponentPayload (success/error), includes component_id and component_run_id; ensures telemetry dispatch in both success and error paths.
Tests: telemetry payloads and splitting `src/backend/tests/integration/test_exception_telemetry.py`, `src/backend/tests/integration/test_telemetry_splitting_integration.py`, `src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py`	Updates/extends tests for new component_id field and inputs payload; adds integration tests for service-level splitting and unit tests for payload splitting, truncation, and chunk metadata.
LFX inputs and component telemetry `src/lfx/src/lfx/inputs/input_mixin.py`, `src/lfx/src/lfx/inputs/inputs.py`, `src/lfx/src/lfx/custom/custom_component/component.py`	Adds SENSITIVE_FIELD_TYPES and track_in_telemetry flag to inputs; sets defaults per input type; component caches serialized, trackable input values; provides get_telemetry_input_values and _should_track_input.
Starter projects: dependency bumps `src/backend/base/langflow/initial_setup/starter_projects/*`	Updates dependency versions in multiple starter project JSONs (e.g., fastapi 0.118.0→0.119.0; several also bump langchain_core 0.3.78→0.3.79; some bump boto3/scrapegraph_py). No logic changes.
Secrets baseline `.secrets.baseline`	Adds new Secret Keyword entry for `src/lfx/src/lfx/inputs/input_mixin.py` and updates timestamp; mirrors hashed_secret in baseline results.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant BuildAPI as Build API
  participant Component
  participant Telemetry as TelemetryService

  rect rgb(245,248,255)
  note right of BuildAPI: Start component build
  BuildAPI->>BuildAPI: Generate component_run_id
  BuildAPI->>Component: map_inputs()
  Component-->>BuildAPI: trackable inputs (optional)
  alt inputs available
    BuildAPI->>Telemetry: log_package_component_inputs(ComponentInputsPayload)
    Telemetry->>Telemetry: split_if_needed()
    loop for each chunk
      Telemetry->>Telemetry: _queue_event("component_inputs", chunk)
    end
  end
  end

  rect rgb(240,255,245)
  note right of BuildAPI: Execute component
  BuildAPI->>Component: execute()
  Component-->>BuildAPI: result
  BuildAPI->>Telemetry: _queue_event("component", ComponentPayload{success=true, componentId, componentRunId})
  end

  rect rgb(255,245,245)
  note right of BuildAPI: Error path
  BuildAPI--xComponent: execute()
  BuildAPI->>BuildAPI: generate error run_id
  opt inputs available
    BuildAPI->>Telemetry: log_package_component_inputs(ComponentInputsPayload)
  end
  BuildAPI->>Telemetry: _queue_event("component", ComponentPayload{success=false, error, componentId, componentRunId})
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

feat: Add exception telemetry #9194 — Also modifies telemetry schema/service to add exception payloads and logging; overlaps in the same subsystem and files.

Suggested labels

enhancement, size:XL

Suggested reviewers

jordanrfrazier

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Test Quality And Coverage	⚠️ Warning	I inspected the new and updated tests related to the Component Inputs telemetry feature. The PR adds focused integration tests that exercise splitting vs. non-splitting paths and unit tests that thoroughly validate the splitting algorithm, truncation behavior, URL-size accounting, and chunk metadata, including property-based coverage with Hypothesis—this goes beyond smoke testing and asserts real behavior. Tests use pytest/pytest-asyncio patterns consistent with backend practices, and there are no frontend/API endpoint changes requiring Playwright or HTTP response tests. However, prior review comments flag multiple instances in test_exception_telemetry.py where ComponentInputsPayload is constructed with JSON strings instead of dicts, which would violate the schema and undermine the splitting/serialization behavior; unless corrected in this PR branch, those tests reduce quality and may fail. Overall coverage and depth are strong, but those type mismatches must be fixed to fully meet the check’s criteria.	Update test_exception_telemetry.py to pass dicts for component_inputs wherever ComponentInputsPayload is constructed, per the reviewer’s suggested diffs, and re-run tests to ensure validation and splitting logic are exercised correctly. If any async test awaits a sync method, align the call or mark appropriately; otherwise keep pytest-asyncio usage where actual awaits occur. After these fixes, the test suite should comprehensively and correctly validate the new telemetry functionality.
Test Coverage For New Implementations	❓ Inconclusive	I inspected the PR’s changes and the repository to verify tests for the new telemetry inputs feature. The PR adds substantial new functionality (ComponentInputsPayload, payload splitting, service method to log inputs, schema and build integration, and input tracking on the LFX side). Corresponding tests are present: a new unit test module for splitting logic (backend/tests/unit/services/telemetry/test_component_inputs_splitting.py), a new integration test validating chunked queuing (backend/tests/integration/test_telemetry_splitting_integration.py), and updates to existing integration tests to incorporate component_id and ComponentInputsPayload cases (backend/tests/integration/test_exception_telemetry.py). Test filenames follow the project convention test_*.py and the tests meaningfully exercise the new behavior, including edge cases and property-based checks. Prior review comments flagged a type mismatch (component_inputs should be dicts, not JSON strings); based on the PR summary, tests were updated to use dicts, but without the exact file content I cannot conclusively confirm all instances were corrected.	Please ensure all occurrences in backend/tests/integration/test_exception_telemetry.py pass component_inputs as dicts rather than JSON strings, matching the schema and splitting logic, particularly around the noted line ranges. If those fixes are present, this check would pass; otherwise, apply the suggested diffs from the review comments to align tests with the new payload type and re-run the suite.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title concisely summarizes the main feature introduced by the changeset, namely adding telemetry support for component inputs, and uses clear, specific wording without unnecessary detail.
Docstring Coverage	✅ Passed	Docstring coverage is 87.10% which is sufficient. The required threshold is 80.00%.
Test File Naming And Structure	✅ Passed	I inspected the test files introduced/modified by this PR and found they follow pytest naming and structure: files are under src/backend/tests/integration and src/backend/tests/unit with test_*.py names, use pytest fixtures (e.g., mock_settings_service), and contain descriptive test function names covering both positive and negative scenarios (e.g., large vs small payload splitting, error telemetry). Integration tests are clearly placed in the integration directory, and unit tests focus on payload splitting logic including edge cases like truncation and special characters. I did not find frontend Playwright tests (.test.ts/tsx), but this PR appears backend-only, so the frontend requirement is not applicable to these changes. Setup/teardown is handled via fixtures and mocks, which is consistent with pytest best practices. Overall, the backend testing structure and coverage are appropriate for the scope of this PR.
Excessive Mock Usage Warning	✅ Passed	I reviewed the tests added/modified in this PR, focusing on mock usage. The new unit tests for ComponentInputsPayload splitting rely on real logic and data generation without mocks, which is appropriate. The new integration test for telemetry splitting introduces a MagicMock-based settings fixture and monkeypatches TelemetryService._queue_event to capture queued events; this is a reasonable boundary mock for external I/O and does not obscure core logic. In test_exception_telemetry, mocks are used to simulate settings/telemetry emission in error paths, but the assertions target serialization and payload composition rather than purely mocked interactions. Overall, mocks are used sparingly to isolate external dependencies (settings, queueing) while exercising real behavior; there is no excessive or misleading mock usage detected.


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (9)

src/backend/base/langflow/services/telemetry/service.py (1)

150-162: Split uses fixed base URL; consider actual base_url to avoid oversize/undersize chunks

Payload splitting relies on ComponentInputsPayload’s internal URL-size calc using a default base URL. If self.base_url differs, chunks may exceed (or underutilize) the limit. Consider passing/using the actual base URL in the size calculation (API change in schema) to keep behavior precise. Otherwise this method looks solid.

Would you verify that the schema’s _calculate_url_size uses the same base URL configured here?
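The concern can be made concrete with a short sketch showing that the encoded URL length shifts by exactly the difference in base URL length, so a chunk sized against one base URL may overflow when sent to a longer one. Both URLs and the helper name below are placeholders, not values from the repository.

```python
from urllib.parse import urlencode


def calculate_url_size(base_url: str, params: dict) -> int:
    """Total length of the GET URL for the given base URL and query parameters."""
    return len(f"{base_url}?{urlencode(params)}")


params = {"componentInputs": '{"temperature": 0.1}'}
default_base = "https://t.example.invalid"  # stand-in for the schema's default
configured_base = "https://telemetry.internal.example.invalid/v2"  # stand-in for self.base_url

size_default = calculate_url_size(default_base, params)
size_configured = calculate_url_size(configured_base, params)

# The query string is identical, so only the base URL length changes the total.
overflow_risk = size_configured - size_default
```

If the configured base URL is longer than the default by `overflow_risk` characters, a chunk packed right up to the limit would exceed it by that amount.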
src/lfx/src/lfx/inputs/input_mixin.py (1)
44-51: Good privacy defaults; minor typing nit

SENSITIVE_FIELD_TYPES and the opt‑in track_in_telemetry default are appropriate. Optionally annotate the constant for clearer typing:
-SENSITIVE_FIELD_TYPES = {
+SENSITIVE_FIELD_TYPES: set[FieldTypes] = {
Also applies to: 109-114
src/lfx/src/lfx/custom/custom_component/component.py (2)
532-549: Harden serialization and avoid breaking component initialization

serialize(input_.value) can raise for unusual types. Wrap to prevent init-time failures and skip un-serializable values.
-            if self._should_track_input(input_):
-                telemetry_values[input_.name] = serialize(input_.value)
+            if self._should_track_input(input_):
+                try:
+                    telemetry_values[input_.name] = serialize(input_.value)
+                except Exception:
+                    # Skip values we can't serialize for telemetry
+                    continue
920-945: Keep telemetry cache in sync when inputs change

When set_input_value updates a tracked input, also update (or remove) the cached telemetry entry so callers get current values.
             try:
                 self._inputs[name].value = value
             except Exception as e:
                 msg = f"Error setting input value for {name}: {e}"
                 raise ValueError(msg) from e
+            # Keep telemetry cache in sync
+            try:
+                input_obj = self._inputs[name]
+                if self._should_track_input(input_obj):
+                    serialized = serialize(value)
+                    if self._telemetry_input_values is None:
+                        self._telemetry_input_values = {}
+                    self._telemetry_input_values[name] = serialized
+                elif self._telemetry_input_values:
+                    self._telemetry_input_values.pop(name, None)
+            except Exception:
+                # Best-effort; ignore telemetry update errors
+                pass
             if hasattr(self._inputs[name], "load_from_db"):
                 self._inputs[name].load_from_db = False
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py (2)
260-260: Replace magic number 2000 with MAX_TELEMETRY_URL_SIZE

Use the shared constant to avoid drift if the limit changes.

Apply this diff:
-    assert chunk_size <= 2000
+    assert chunk_size <= MAX_TELEMETRY_URL_SIZE
Also applies to: 303-303, 331-331

347-466: Optional: disable Hypothesis deadlines to prevent flakiness

These property tests do repeated httpx/orjson work; add settings(deadline=None) (and optionally tune max_examples) to reduce spurious DeadlineExceeded.

Example:
from hypothesis import settings

@settings(deadline=None)
@given(...)
def test_property_split_never_exceeds_max_size(...):
    ...
src/backend/tests/integration/test_telemetry_splitting_integration.py (1)
39-43: Assert queued path equals "component_inputs"

Strengthen the test by asserting the path value for each queued event.

Apply this diff:
     for event in queued_events:
         assert isinstance(event, tuple)
         assert len(event) == 3
+        assert event[2] == "component_inputs"
src/backend/base/langflow/api/build.py (1)
380-396: Generate component_run_id once and reuse across success/error; avoid deriving name from ID

Move component_run_id creation to the start of _build_vertex and reuse in all branches. This guarantees joinability of inputs/execution even for late failures.

Prefer a stable component name source (e.g., vertex.display_name or class/type) over vertex_id.split("-")[0].

Apply these diffs:
-            # Generate run_id for this component execution
-            component_run_id = str(uuid.uuid4())
+            # component_run_id is generated once per _build_vertex invocation (see note below)
-            # Generate run_id for this component execution (error case)
-            component_run_id = str(uuid.uuid4())
+            # Reuse the same component_run_id generated at function start
And add near the top of _build_vertex (outside the selected ranges):
# At the beginning of _build_vertex (right after locals like flow_id_str/start_time):
component_run_id = str(uuid.uuid4())
# If available, prefer a metadata-driven name, e.g.:
component_name = getattr(vertex, "name", None) or getattr(vertex, "display_name", None) or vertex_id.split("-")[0]
Also applies to: 411-439
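A self-contained sketch of the "generate once, reuse everywhere" pattern suggested here. The function `run_component`, the event tuple shape, and the payload key names are hypothetical stand-ins for `_build_vertex` and the telemetry payloads, chosen only to show that inputs and outcome events share one ID on both paths.

```python
import uuid
from typing import Any, Callable


def run_component(execute: Callable[[], Any], inputs: dict, events: list) -> Any:
    """Run a component and record telemetry events that share one run ID."""
    # Generated once, before any branch, so every event can be joined on it.
    component_run_id = str(uuid.uuid4())
    if inputs:
        events.append(("component_inputs", {"componentRunId": component_run_id, "inputs": inputs}))
    try:
        result = execute()
    except Exception as exc:
        # Error path reuses the same ID instead of creating a new one.
        events.append(("component", {"componentRunId": component_run_id, "success": False, "error": str(exc)}))
        raise
    events.append(("component", {"componentRunId": component_run_id, "success": True}))
    return result
```

With a single ID, the inputs event and the success or failure event for the same execution can be correlated downstream even when the failure happens late.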
src/backend/base/langflow/services/telemetry/schema.py (1)

101-167: Optional: factor out duplicate test‑payload construction in _truncate_value_to_fit

Both branches build nearly identical ComponentInputsPayload test objects. Extract a tiny helper to reduce duplication and speed up binary search loops slightly.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db280e4 and 2594e23.

📒 Files selected for processing (42)

.secrets.baseline (1 hunks)
src/backend/base/langflow/api/build.py (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Basic Prompt Chaining.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Basic Prompting.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Blog Writer.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Custom Component Generator.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Document Q&A.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Instagram Copywriter.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Invoice Summarizer.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Market Research.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Meeting Summary.json (3 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Memory Chatbot.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/News Aggregator.json (4 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (3 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Pokédex Agent.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website Code Generator.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Price Deal Finder.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Research Agent.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/SEO Keyword Generator.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/SaaS Pricing.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Search agent.json (3 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Sequential Tasks Agents.json (5 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Simple Agent.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Social Media Agent.json (4 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Text Sentiment Analysis.json (2 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Travel Planning Agents.json (4 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Twitter Thread Generator.json (1 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json (3 hunks)
src/backend/base/langflow/initial_setup/starter_projects/Youtube Analysis.json (3 hunks)
src/backend/base/langflow/services/telemetry/schema.py (2 hunks)
src/backend/base/langflow/services/telemetry/service.py (2 hunks)
src/backend/tests/integration/test_exception_telemetry.py (5 hunks)
src/backend/tests/integration/test_telemetry_splitting_integration.py (1 hunks)
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py (1 hunks)
src/lfx/src/lfx/custom/custom_component/component.py (4 hunks)
src/lfx/src/lfx/inputs/input_mixin.py (3 hunks)
src/lfx/src/lfx/inputs/inputs.py (11 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

{src/backend/**/*.py,tests/**/*.py,Makefile}

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

{src/backend/**/*.py,tests/**/*.py,Makefile}: Run make format_backend to format Python code before linting or committing changes
Run make lint to perform linting checks on backend Python code

Files:

src/backend/base/langflow/services/telemetry/service.py
src/backend/tests/integration/test_telemetry_splitting_integration.py
src/backend/base/langflow/services/telemetry/schema.py
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py
src/backend/tests/integration/test_exception_telemetry.py
src/backend/base/langflow/api/build.py

src/backend/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)

src/backend/tests/**/*.py: Unit tests for backend code must be located in the 'src/backend/tests/' directory, with component tests organized by component subdirectory under 'src/backend/tests/unit/components/'.
Test files should use the same filename as the component under test, with an appropriate test prefix or suffix (e.g., 'my_component.py' → 'test_my_component.py').
Use the 'client' fixture (an async httpx.AsyncClient) for API tests in backend Python tests, as defined in 'src/backend/tests/conftest.py'.
When writing component tests, inherit from the appropriate base class in 'src/backend/tests/base.py' (ComponentTestBase, ComponentTestBaseWithClient, or ComponentTestBaseWithoutClient) and provide the required fixtures: 'component_class', 'default_kwargs', and 'file_names_mapping'.
Each test in backend Python test files should have a clear docstring explaining its purpose, and complex setups or mocks should be well-commented.
Test both sync and async code paths in backend Python tests, using '@pytest.mark.asyncio' for async tests.
Mock external dependencies appropriately in backend Python tests to isolate unit tests from external services.
Test error handling and edge cases in backend Python tests, including using 'pytest.raises' and asserting error messages.
Validate input/output behavior and test component initialization and configuration in backend Python tests.
Use the 'no_blockbuster' pytest marker to skip the blockbuster plugin in tests when necessary.
Be aware of ContextVar propagation in async tests; test both direct event loop execution and 'asyncio.to_thread' scenarios to ensure proper context isolation.
Test error handling by mocking internal functions using monkeypatch in backend Python tests.
Test resource cleanup in backend Python tests by using fixtures that ensure proper initialization and cleanup of resources.
Test timeout and performance constraints in backend Python tests using 'asyncio.wait_for' and timing assertions.
Test Langflow's Messag...

Files:

src/backend/tests/integration/test_telemetry_splitting_integration.py
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py
src/backend/tests/integration/test_exception_telemetry.py

**/@(test_*.py|*.test.@(ts|tsx))

📄 CodeRabbit inference engine (coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt)

**/@(test_*.py|*.test.@(ts|tsx)): Check if tests have too many mock objects that obscure what's actually being tested
Warn when mocks are used instead of testing real behavior and interactions
Suggest using real objects or test doubles when mocks become excessive
Ensure mocks are used appropriately for external dependencies, not core logic
Recommend integration tests when unit tests become overly mocked
Test files should have descriptive test function names explaining what is tested
Tests should be organized logically with proper setup and teardown
Include edge cases and error conditions for comprehensive coverage
Verify tests cover both positive and negative scenarios where appropriate
Tests should cover the main functionality being implemented
Ensure tests are not just smoke tests but actually validate behavior
For API endpoints, verify both success and error response testing

Files:

src/backend/tests/integration/test_telemetry_splitting_integration.py
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py
src/backend/tests/integration/test_exception_telemetry.py

**/test_*.py

📄 CodeRabbit inference engine (coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt)

**/test_*.py: Check that backend test files follow naming convention: test_*.py
Backend tests should be named test_*.py and follow proper pytest structure
For async Python code, ensure proper async testing patterns (pytest) are used
Backend tests should follow pytest conventions; frontend tests should use Playwright

Files:

src/backend/tests/integration/test_telemetry_splitting_integration.py
src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py
src/backend/tests/integration/test_exception_telemetry.py

src/backend/tests/unit/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

Test component integration within flows using create_flow, build_flow, and get_build_events utilities

Files:

src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py

src/backend/**/*component*.py

📄 CodeRabbit inference engine (.cursor/rules/icons.mdc)

In your Python component class, set the icon attribute to a string matching the frontend icon mapping exactly (case-sensitive).

Files:

src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py

🧬 Code graph analysis (6)

src/backend/base/langflow/services/telemetry/service.py (1)

src/backend/base/langflow/services/telemetry/schema.py (2)

ComponentInputsPayload (52-262)

split_if_needed (168-262)

src/backend/tests/integration/test_telemetry_splitting_integration.py (2)

src/backend/base/langflow/services/telemetry/schema.py (1)

ComponentInputsPayload (52-262)

src/backend/base/langflow/services/telemetry/service.py (3)

TelemetryService (33-229)

_queue_event (114-117)

log_package_component_inputs (150-161)

src/backend/base/langflow/services/telemetry/schema.py (1)

src/lfx/src/lfx/custom/custom_component/component.py (1)

log (1505-1522)

src/backend/tests/unit/services/telemetry/test_component_inputs_splitting.py (1)

src/backend/base/langflow/services/telemetry/schema.py (3)

ComponentInputsPayload (52-262)

_calculate_url_size (82-99)

split_if_needed (168-262)

src/backend/tests/integration/test_exception_telemetry.py (3)

src/backend/base/langflow/services/telemetry/schema.py (2)

ComponentPayload (43-49)

ComponentInputsPayload (52-262)

src/lfx/src/lfx/inputs/input_mixin.py (1)

FieldTypes (18-39)

src/lfx/src/lfx/inputs/inputs.py (4)

BoolInput (418-431)

IntInput (346-379)

SecretStrInput (287-343)

StrInput (126-182)

src/backend/base/langflow/api/build.py (3)

src/backend/base/langflow/services/telemetry/schema.py (2)

ComponentInputsPayload (52-262)

ComponentPayload (43-49)

src/lfx/src/lfx/custom/custom_component/component.py (1)

get_telemetry_input_values (560-563)

src/backend/base/langflow/services/telemetry/service.py (2)

log_package_component_inputs (150-161)

log_package_component (147-148)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)

GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
GitHub Check: Test Docker Images / Test docker images
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
GitHub Check: Lint Backend / Run Mypy (3.13)
GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
GitHub Check: Lint Backend / Run Mypy (3.10)
GitHub Check: Lint Backend / Run Mypy (3.11)
GitHub Check: Lint Backend / Run Mypy (3.12)
GitHub Check: Test Starter Templates
GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
GitHub Check: Optimize new Python code in this PR
GitHub Check: test-starter-projects

🔇 Additional comments (23)

src/backend/base/langflow/initial_setup/starter_projects/Youtube Analysis.json (3)

300-302: Dependency bump looks good.

Incrementing langchain_core to 0.3.79 keeps this starter aligned with the rest of the PR’s ecosystem updates. No further changes needed.

766-768: Agent dependency update acknowledged.

langchain_core 0.3.79 matches the other starter updates and should keep the agent tooling consistent.

1580-1583: FastAPI bump is consistent.

Moving to FastAPI 0.119.0 aligns this starter project with the upgraded baseline elsewhere in the PR.

src/backend/base/langflow/initial_setup/starter_projects/Pokédex Agent.json (1)

381-382: Version bump looks consistent.

The FastAPI and langchain_core updates align with the broader dependency refresh—no issues spotted here.

Also applies to: 1198-1199

src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json (1)

714-714: Starter template dependency bump looks good.

The updated fastapi and langchain_core versions align with the telemetry work in this PR. ✅

Also applies to: 1858-1858

src/backend/base/langflow/initial_setup/starter_projects/Travel Planning Agents.json (2)

487-488: fastapi 0.119.0 bump looks good

Nice to see the starter project tracking the latest FastAPI release; no issues spotted with this metadata tweak.

1625-1626: langchain_core version alignment

Thanks for keeping every agent entry in sync at 0.3.79—helps prevent mismatched starter dependencies.

Also applies to: 2266-2267, 2908-2909

src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json (1)

244-245: fastapi bump acknowledged

Version pin updated to 0.119.0 here as well—looks consistent with the other starter JSON.

src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (1)

413-415: FastAPI bump LGTM.

The template’s dependency metadata now points to FastAPI 0.119.0; nothing else changed and this aligns with the broader dependency sweep. Looks good.

src/backend/base/langflow/initial_setup/starter_projects/Market Research.json (1)

464-466: Version updates look consistent.

Both FastAPI 0.119.0 and langchain_core 0.3.79 align with the repo-wide bump; no other metadata shifts observed. All good.

Also applies to: 1506-1510

src/backend/base/langflow/initial_setup/starter_projects/Custom Component Generator.json (1)

2195-2204: FastAPI bump aligns with the starter upgrades.

The dependency metadata now points at FastAPI 0.119.0, matching the repo-wide starter updates mentioned in the PR. No blockers spotted.

src/backend/base/langflow/initial_setup/starter_projects/Social Media Agent.json (1)

160-161: Dependency bumps look good.

The langchain_core and fastapi version bumps are consistent with the wider telemetry changes and align with the rest of the PR. No issues spotted.

Also applies to: 391-392, 960-961, 1250-1251

src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json (2)

1048-1054: Dependency bump looks good.

fastapi → 0.119.0 keeps us on the latest patch without known incompatibilities for this template. ✅

3444-3450: Nice keeping langchain_core current.

Updating to 0.3.79 aligns the starter with the latest langchain releases and should be safe here.

.secrets.baseline (1)

1403-1415: Confirm new baseline entry is a false positive.

Please double-check that the value flagged at src/lfx/src/lfx/inputs/input_mixin.py Line 21 is truly non-sensitive before locking it into the baseline. If it’s intentional, consider adding an inline allowlist pragma (or similar) near the source so we don’t have to maintain it here.
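A minimal sketch of the inline-pragma option, assuming the baseline is maintained with detect-secrets (which the `.secrets.baseline` filename suggests) and that the flagged line is an enum member whose value is only a field-type label. The enum below is a hypothetical stand-in, not the actual contents of `input_mixin.py`.

```python
from enum import Enum


class FieldTypes(str, Enum):
    """Hypothetical stand-in for the enum in lfx/inputs/input_mixin.py."""

    TEXT = "str"
    # The value is a field-type label, not a credential. The trailing pragma
    # asks detect-secrets to skip this line, so no baseline entry is needed.
    PASSWORD = "password"  # pragma: allowlist secret


print(FieldTypes.PASSWORD.value)
```

With the pragma next to the source line, the exemption travels with the code instead of living in a separate baseline file.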

src/backend/base/langflow/initial_setup/starter_projects/Basic Prompt Chaining.json (1)

621-621: Version bump LGTM.

fastapi 0.119.0 is already used elsewhere in this PR; keeping the starter project in sync makes sense.

src/backend/base/langflow/initial_setup/starter_projects/Search agent.json (3)

115-115: scrapegraph_py bump looks good.

Aligning the template with 1.34.0 keeps us current with ScrapeGraph’s API fixes.

556-556: FastAPI version update approved.

0.119.0 matches the rest of the templates, so this keeps dependencies consistent.

905-905: langchain_core patch bump works for me.

Moving to 0.3.79 tracks the upstream fixes without breaking changes.

src/backend/base/langflow/initial_setup/starter_projects/Basic Prompting.json (1)

573-573: FastAPI patch upgrade acknowledged.

Keeping this starter on 0.119.0 preserves parity with the other templates.

src/backend/base/langflow/initial_setup/starter_projects/Research Agent.json (1)

1630-1631: Version bumps look fine; please confirm runtime compatibility

fastapi 0.118.0 → 0.119.0

langchain_core 0.3.78 → 0.3.79

These are minor updates; LGTM. Please confirm starter flow still runs end-to-end with these versions in your environment.

Also applies to: 2538-2539

src/lfx/src/lfx/inputs/inputs.py (1)

285-285: Per-input telemetry flags align with privacy goals

Sensitive types are opted out; safe numeric/boolean/UI types are opted in. LGTM.

Also applies to: 302-302, 357-357, 393-393, 431-431, 497-497, 509-509, 524-524, 570-570, 632-632, 647-647

src/backend/base/langflow/services/telemetry/schema.py (1)

82-100: Ensure encoding used for size calculation matches actual send logic

split_if_needed uses _calculate_url_size, which JSON‑encodes componentInputs, but TelemetryService.send_telemetry_data appears to pass model_dump(...) directly as params. This mismatch can cause chunks to exceed the real URL size at runtime.

Please align behaviors by JSON‑encoding componentInputs at send time, or by adding a model_serializer/params builder on ComponentInputsPayload that ensures consistent encoding across both sizing and sending. Example change (in TelemetryService.send_telemetry_data):
params = payload.model_dump(by_alias=True, exclude_none=True, exclude_unset=True)
if "componentInputs" in params and isinstance(params["componentInputs"], dict):
    import orjson
    params["componentInputs"] = orjson.dumps(params["componentInputs"]).decode("utf-8")
await self.client.get(url, params=params)
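
To illustrate why the two encodings can disagree, here is a standard-library sketch. It assumes compact JSON on the sizing path (similar to what orjson emits) and a send path that simply stringifies the dict; the endpoint URL and the `url_size` helper are hypothetical and are not taken from the PR.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only to make the URL length concrete.
BASE_URL = "https://telemetry.example.invalid/component_inputs"


def url_size(params: dict) -> int:
    """Byte length of the GET URL produced by form-encoding params."""
    return len(f"{BASE_URL}?{urlencode(params)}".encode("utf-8"))


inputs = {"temperature": 0.7, "model": "gpt-4", "note": "a b&c"}

# Sizing path: the dict is JSON-encoded (compact) before URL encoding.
sized = url_size(
    {
        "componentName": "OpenAIModel",
        "componentInputs": json.dumps(inputs, separators=(",", ":")),
    }
)

# Naive send path: the dict is passed as-is, so urlencode falls back to
# str(dict), which is the Python repr (with spaces), not JSON.
sent = url_size({"componentName": "OpenAIModel", "componentInputs": inputs})

print(f"sized={sized} sent={sent} underestimated_by={sent - sized}")
```

In this sketch the sizing path underestimates the sent URL, which is the failure mode the comment describes: a chunk can pass the size check and still exceed the limit on the wire.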

coderabbitai · 2025-10-13T20:18:49Z

src/backend/tests/integration/test_exception_telemetry.py

+    def test_component_inputs_payload_creation_and_serialization(self):
+        """Test ComponentInputsPayload creation and serialization."""
+        from langflow.services.telemetry.schema import ComponentInputsPayload
+
+        payload = ComponentInputsPayload(
+            component_run_id="run-abc-123",
+            component_id="OpenAIModel-xyz789",
+            component_name="OpenAIModel",
+            component_inputs='{"temperature":0.7,"model":"gpt-4"}',
+        )
+
+        assert payload.component_run_id == "run-abc-123"
+        assert payload.component_id == "OpenAIModel-xyz789"
+        assert payload.component_name == "OpenAIModel"
+        assert payload.component_inputs == '{"temperature":0.7,"model":"gpt-4"}'
+
+        serialized = payload.model_dump(by_alias=True)
+        expected = {
+            "componentRunId": "run-abc-123",
+            "componentId": "OpenAIModel-xyz789",
+            "componentName": "OpenAIModel",
+            "componentInputs": '{"temperature":0.7,"model":"gpt-4"}',
+            "clientType": None,
+        }
        assert serialized == expected


⚠️ Potential issue | 🔴 Critical

Fix type mismatch: component_inputs must be a dict, not JSON string

Schema defines component_inputs: dict[str, Any]. Passing a JSON string will fail validation and also diverges from splitting/encoding logic.

Apply this diff:

-        payload = ComponentInputsPayload(
-            component_run_id="run-abc-123",
-            component_id="OpenAIModel-xyz789",
-            component_name="OpenAIModel",
-            component_inputs='{"temperature":0.7,"model":"gpt-4"}',
-        )
+        payload = ComponentInputsPayload(
+            component_run_id="run-abc-123",
+            component_id="OpenAIModel-xyz789",
+            component_name="OpenAIModel",
+            component_inputs={"temperature": 0.7, "model": "gpt-4"},
+        )
@@
-        expected = {
+        expected = {
             "componentRunId": "run-abc-123",
             "componentId": "OpenAIModel-xyz789",
             "componentName": "OpenAIModel",
-            "componentInputs": '{"temperature":0.7,"model":"gpt-4"}',
+            "componentInputs": {"temperature": 0.7, "model": "gpt-4"},
             "clientType": None,
         }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In src/backend/tests/integration/test_exception_telemetry.py around lines 210 to 234, the test passes component_inputs as a JSON string but the schema defines component_inputs as dict[str, Any]; change the payload creation to pass a Python dict (e.g. {"temperature": 0.7, "model": "gpt-4"}) instead of the JSON string, update the inline assertion that checks payload.component_inputs to expect the dict, and update the expected serialized dict to have componentInputs set to that dict (not a JSON string) so the model_dump(by_alias=True) comparison matches the schema.

coderabbitai · 2025-10-13T20:18:49Z

src/backend/tests/integration/test_exception_telemetry.py

+        run_id = "run-xyz-789"
+
+        # First component
+        payload1 = ComponentInputsPayload(
+            component_run_id=run_id,
+            component_id="Component1-abc",
+            component_name="Component1",
+            component_inputs='{"input1":"value1"}',
+        )
+
+        # Second component with same run_id
+        payload2 = ComponentInputsPayload(
+            component_run_id=run_id,
+            component_id="Component2-def",
+            component_name="Component2",
+            component_inputs='{"input2":"value2"}',
+        )


⚠️ Potential issue | 🔴 Critical

Fix type mismatch in multi‑payload run: use dicts for component_inputs

Both payloads should pass dicts to match schema and splitting behavior.

Apply this diff:

         payload1 = ComponentInputsPayload(
             component_run_id=run_id,
             component_id="Component1-abc",
             component_name="Component1",
-            component_inputs='{"input1":"value1"}',
+            component_inputs={"input1": "value1"},
         )
@@
         payload2 = ComponentInputsPayload(
             component_run_id=run_id,
             component_id="Component2-def",
             component_name="Component2",
-            component_inputs='{"input2":"value2"}',
+            component_inputs={"input2": "value2"},
         )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        run_id = "run-xyz-789"
-
-        # First component
-        payload1 = ComponentInputsPayload(
-            component_run_id=run_id,
-            component_id="Component1-abc",
-            component_name="Component1",
-            component_inputs='{"input1":"value1"}',
-        )
-
-        # Second component with same run_id
-        payload2 = ComponentInputsPayload(
-            component_run_id=run_id,
-            component_id="Component2-def",
-            component_name="Component2",
-            component_inputs='{"input2":"value2"}',
-        )
+        run_id = "run-xyz-789"
+
+        # First component
+        payload1 = ComponentInputsPayload(
+            component_run_id=run_id,
+            component_id="Component1-abc",
+            component_name="Component1",
+            component_inputs={"input1": "value1"},
+        )
+
+        # Second component with same run_id
+        payload2 = ComponentInputsPayload(
+            component_run_id=run_id,
+            component_id="Component2-def",
+            component_name="Component2",
+            component_inputs={"input2": "value2"},
+        )

🤖 Prompt for AI Agents

In src/backend/tests/integration/test_exception_telemetry.py around lines 575-591 the ComponentInputsPayload instances pass JSON strings for component_inputs, causing a type mismatch with the expected schema and splitting logic; change component_inputs from JSON strings to Python dicts for both payload1 and payload2 (e.g. component_inputs={"input1": "value1"} and component_inputs={"input2": "value2"}) so the payloads match the schema and splitting behavior.

coderabbitai · 2025-10-13T20:18:50Z

src/lfx/src/lfx/custom/custom_component/component.py

+    def _should_track_input(self, input_obj: InputTypes) -> bool:
+        """Check if input should be tracked in telemetry."""
+        from lfx.inputs.input_mixin import SENSITIVE_FIELD_TYPES
+
+        # Respect opt-out flag
+        if not getattr(input_obj, "track_in_telemetry", True):
+            return False
+        # Auto-exclude sensitive field types
+        return not (hasattr(input_obj, "field_type") and input_obj.field_type in SENSITIVE_FIELD_TYPES)
+


⚠️ Potential issue | 🟠 Major

Fix opt‑in default to avoid unintended telemetry collection

getattr(..., True) defaults to tracking when the attribute is absent, which contradicts the documented opt‑in default and can leak values from inputs that don’t declare track_in_telemetry (e.g., the fallback Input). The default should be False.

Apply:

-        # Respect opt-out flag
-        if not getattr(input_obj, "track_in_telemetry", True):
+        # Respect opt-in flag; default to False when absent
+        if not getattr(input_obj, "track_in_telemetry", False):
             return False

📝 Committable suggestion


Suggested change

-    def _should_track_input(self, input_obj: InputTypes) -> bool:
-        """Check if input should be tracked in telemetry."""
-        from lfx.inputs.input_mixin import SENSITIVE_FIELD_TYPES
-
-        # Respect opt-out flag
-        if not getattr(input_obj, "track_in_telemetry", True):
-            return False
-        # Auto-exclude sensitive field types
-        return not (hasattr(input_obj, "field_type") and input_obj.field_type in SENSITIVE_FIELD_TYPES)
+    def _should_track_input(self, input_obj: InputTypes) -> bool:
+        """Check if input should be tracked in telemetry."""
+        from lfx.inputs.input_mixin import SENSITIVE_FIELD_TYPES
+
+        # Respect opt-in flag; default to False when absent
+        if not getattr(input_obj, "track_in_telemetry", False):
+            return False
+        # Auto-exclude sensitive field types
+        return not (hasattr(input_obj, "field_type") and input_obj.field_type in SENSITIVE_FIELD_TYPES)

🤖 Prompt for AI Agents

In src/lfx/src/lfx/custom/custom_component/component.py around lines 550 to 559, the getattr call currently defaults missing track_in_telemetry to True which causes opt‑out behavior; change the default to False so inputs that don't declare track_in_telemetry are not tracked by default. Update the check to use getattr(input_obj, "track_in_telemetry", False) (preserving the subsequent logic) and run related tests to ensure sensitive inputs remain excluded.
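
The effect of the `getattr` default can be shown in isolation. The classes below are hypothetical stand-ins for Langflow input types, and `SENSITIVE_FIELD_TYPES` here is an illustrative subset; only the two-step check mirrors the snippet under review.

```python
from enum import Enum


class FieldTypes(str, Enum):
    TEXT = "str"
    PASSWORD = "password"


# Illustrative subset; the real set lives in lfx.inputs.input_mixin.
SENSITIVE_FIELD_TYPES = {FieldTypes.PASSWORD}


class FallbackInput:
    """Hypothetical input that never declares track_in_telemetry."""

    field_type = FieldTypes.TEXT


class IntLikeInput:
    """Hypothetical 'safe' input that explicitly opts in."""

    field_type = FieldTypes.TEXT
    track_in_telemetry = True


class SecretLikeInput:
    """Hypothetical sensitive input; excluded even if the flag is set."""

    field_type = FieldTypes.PASSWORD
    track_in_telemetry = True


def should_track(input_obj: object, default: bool) -> bool:
    """Same two-step check as _should_track_input, with the default exposed."""
    if not getattr(input_obj, "track_in_telemetry", default):
        return False
    return not (
        hasattr(input_obj, "field_type")
        and input_obj.field_type in SENSITIVE_FIELD_TYPES
    )


for cls in (FallbackInput, IntLikeInput, SecretLikeInput):
    obj = cls()
    print(cls.__name__, should_track(obj, default=True), should_track(obj, default=False))
```

Only the input that lacks the attribute changes behavior between the two defaults, which is exactly the case the review comment is concerned about.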

src/backend/base/langflow/services/telemetry/schema.py

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>

codeflash-ai · 2025-10-13T21:06:16Z

This PR is now faster! 🚀 Gabriel Luiz Freitas Almeida accepted my code suggestion above.

This commit introduces a new function, _log_component_input_telemetry, to centralize the logic for logging component input telemetry. The function is called in two places within the generate_flow_events function, improving code readability and maintainability by reducing duplication. This change enhances the clarity of telemetry handling in the flow generation process.

This commit refines the truncation logic for input values in the ComponentInputsPayload class. The previous binary search method for string values has been simplified, allowing for direct truncation of both string and non-string values. This change enhances code clarity and maintains functionality while ensuring optimal handling of oversized inputs.
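
As a rough illustration of direct truncation (as opposed to a binary search for the largest fitting prefix), the sketch below cuts any value's text form to a character budget. The helper name, the marker string, and the character-based budget are assumptions for this example; the PR's implementation may measure the URL-encoded size instead.

```python
def truncate_value(value: object, max_chars: int, marker: str = "...[truncated]") -> object:
    """Return value unchanged if its text form fits, else a cut-down string.

    Hypothetical helper: strings are cut directly and non-strings are cut
    on their repr, so both kinds of value take the same code path.
    """
    text = value if isinstance(value, str) else repr(value)
    if len(text) <= max_chars:
        return value
    keep = max(0, max_chars - len(marker))
    # The final slice guards the case where the marker alone exceeds the budget.
    return (text[:keep] + marker)[:max_chars]


print(truncate_value("short", 30))
print(truncate_value("x" * 100, 30))
print(truncate_value(list(range(50)), 30))
```

A single slice is simpler than a search, at the cost of not being exact when the budget is defined in encoded bytes rather than characters.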

* feat: Introduce telemetry tracking for sensitive field types
  Added a new set of field types that should not be tracked in telemetry due to their sensitive nature, including PASSWORD, AUTH, FILE, CONNECTION, and MCP. Updated relevant input classes to ensure telemetry tracking is disabled for these sensitive fields, enhancing data privacy and security.
* feat: Enhance telemetry payloads with additional fields and serialization support
  Added new fields to the ComponentPayload and ComponentInputsPayload classes, including component_id and component_run_id, to improve telemetry data tracking. Introduced a serialize_input_values function to handle JSON serialization of component input values, ensuring robust handling of input data for telemetry purposes.
* feat: Implement telemetry input tracking and caching
  Added functionality to track and cache telemetry input values within the Component class. Introduced a method to determine if inputs should be tracked based on sensitivity and an accessor for retrieving cached telemetry data, enhancing the robustness of telemetry handling.
* feat: Add logging for component input telemetry
  Introduced a new method, log_package_component_inputs, to the TelemetryService for logging telemetry data related to component inputs. This enhancement improves the tracking capabilities of the telemetry system, allowing for more detailed insights into component interactions.
* feat: Enhance telemetry logging for component execution
  Added functionality to log component input telemetry both during successful execution and error cases. Introduced a unique component_run_id for each execution to improve tracking. This update ensures comprehensive telemetry data collection, enhancing the robustness of the telemetry system.
* feat: Extend telemetry payload tests and enhance serialization
  Added tests for the new component_id and component_run_id fields in ComponentPayload and ComponentInputsPayload classes. Introduced a new test suite for ComponentInputTelemetry, covering serialization of various data types and handling of edge cases. This update improves the robustness and coverage of telemetry data handling in the system.
* fix: Update default telemetry tracking behavior in BaseInputMixin
  Changed the default value of track_in_telemetry from True to False in the BaseInputMixin class. Updated documentation to clarify that telemetry tracking is now opt-in and can be explicitly enabled for individual input types, enhancing data privacy and control.
* fix: Update telemetry tracking defaults for input types
  Modified the default value of `track_in_telemetry` for various input classes to enhance data privacy. Regular inputs now default to False, while safe inputs like `IntInput` and `BoolInput` default to True, ensuring explicit opt-in for telemetry tracking. Updated related tests to reflect these changes.
* feat: add chunk_index and total_chunks fields to ComponentInputsPayload
  This commit adds two new optional fields to ComponentInputsPayload:
  - chunk_index: Index of this chunk in a split payload sequence
  - total_chunks: Total number of chunks in the split sequence
  Both fields default to None and use camelCase aliases for serialization. This is Task 1 of the telemetry query parameter splitting implementation.
  Tests included:
  - Verify fields exist and can be set
  - Verify camelCase serialization aliases work correctly
  - Verify fields default to None when not provided
  Generated with Claude Code (https://claude.com/claude-code)
  Co-Authored-By: Claude <[email protected]>
* refactor: update ComponentInputsPayload to support automatic splitting of oversized inputs
  This commit enhances the ComponentInputsPayload class by implementing functionality to automatically split input values into multiple chunks if they exceed the maximum URL size limit. Key changes include:
  - Added methods for calculating URL size, truncating oversized values, and splitting payloads.
  - Updated component_inputs field to accept a dictionary instead of a string for better handling of input values.
  - Improved documentation for the ComponentInputsPayload class to reflect the new splitting behavior and usage examples.
  These changes aim to improve telemetry data handling and ensure compliance with URL length restrictions.
* refactor: enhance log_package_component_inputs to handle oversized payloads
  This commit updates the log_package_component_inputs method in the TelemetryService class to split component input payloads into multiple requests if they exceed the maximum URL size limit. Key changes include:
  - Added logic to split the payload using the new split_if_needed method.
  - Each chunk is queued separately for telemetry logging.
  These improvements ensure better handling of telemetry data while adhering to URL length restrictions.
* refactor: centralize maximum telemetry URL size constant
  This commit introduces a centralized constant, MAX_TELEMETRY_URL_SIZE, to define the maximum URL length for telemetry GET requests. Key changes include:
  - Added MAX_TELEMETRY_URL_SIZE constant to schema.py for better maintainability.
  - Updated split_if_needed method in ComponentInputsPayload to use the new constant instead of a hardcoded value.
  - Adjusted the TelemetryService to reference the centralized constant for URL size limits.
  These changes enhance code clarity and ensure consistent handling of URL size limits across the telemetry service.
* refactor: update ComponentInputsPayload tests to use dictionary inputs
  This commit modifies the tests for ComponentInputsPayload to utilize a dictionary for component inputs instead of a serialized JSON string. Key changes include:
  - Renamed the test method to reflect the new input type.
  - Removed unnecessary serialization steps and assertions related to JSON strings.
  - Added assertions to verify the correct handling of dictionary inputs.
  These changes streamline the testing process and improve clarity in how component inputs are represented.
* test: add integration tests for telemetry service payload splitting
  This commit introduces integration tests for the TelemetryService to verify its handling of large and small payloads. Key changes include:
  - Added tests to ensure large payloads are split into multiple chunks and queued correctly.
  - Implemented a test to confirm that small payloads are not split and result in a single queued event.
  - Created a mock settings service for testing purposes.
  These tests enhance the reliability of the telemetry service by ensuring proper payload management.
* test: enhance ComponentInputsPayload tests with additional scenarios
  This commit expands the test suite for ComponentInputsPayload by adding various scenarios to ensure robust handling of input payloads. Key changes include:
  - Introduced tests for calculating URL size, ensuring it returns a positive integer and accounts for encoding.
  - Added tests to verify the splitting logic for large payloads, including checks for chunk metadata and preservation of fixed fields.
  - Implemented property-based tests using Hypothesis to validate that all chunks respect the maximum URL size and preserve original data.
  These enhancements improve the reliability and coverage of the ComponentInputsPayload tests, ensuring proper functionality under various conditions.
* [autofix.ci] apply automated fixes
* optimize query param encoding
  Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
* refactor: extract telemetry logging logic into a separate function
  This commit introduces a new function, _log_component_input_telemetry, to centralize the logic for logging component input telemetry. The function is called in two places within the generate_flow_events function, improving code readability and maintainability by reducing duplication. This change enhances the clarity of telemetry handling in the flow generation process.
* refactor: optimize truncation logic in ComponentInputsPayload
  This commit refines the truncation logic for input values in the ComponentInputsPayload class. The previous binary search method for string values has been simplified, allowing for direct truncation of both string and non-string values. This change enhances code clarity and maintains functionality while ensuring optimal handling of oversized inputs.
* refactor: update telemetry tracking logic to respect opt-in flag
  This commit modifies the telemetry tracking logic in the Component class to change the default behavior of the `track_in_telemetry` attribute from True to False. This adjustment enhances user privacy by requiring explicit consent for tracking input objects in telemetry. The change ensures that sensitive field types are still auto-excluded from tracking, maintaining the integrity of the telemetry data.
* refactor: update tests to use dictionary format for component inputs
  This commit modifies the integration tests for telemetry payload validation and component input telemetry to utilize dictionaries for component inputs instead of serialized JSON strings. Key changes include:
  - Updated assertions to compare dictionary inputs directly.
  - Enhanced clarity and maintainability of the test cases by removing unnecessary serialization steps.
  These changes improve the representation of component inputs in tests, aligning with recent refactoring efforts.
* [autofix.ci] apply automated fixes
* refactor: specify type for current_chunk_inputs in ComponentInputsPayload
  This commit updates the type annotation for the current_chunk_inputs variable in the ComponentInputsPayload class to explicitly define it as a dictionary. This change enhances code clarity and maintainability by providing better type information for developers working with the code.
* test: add component_id to ComponentPayload tests
  This commit enhances the test cases for the ComponentPayload class by adding a component_id parameter to various initialization tests. The updates ensure that the component_id is properly tested across different scenarios, including valid parameters, error messages, and edge cases. This change improves the robustness of the tests and aligns with recent updates to the ComponentPayload structure.
* [autofix.ci] apply automated fixes
* feat: add component_id to ComponentPayload in build_vertex function
* fix: update MAX_TELEMETRY_URL_SIZE to 2048 and adjust related tests
  This commit increases the maximum URL size for telemetry GET requests from 2000 to 2048 bytes to align with Scarf's specifications. Corresponding test assertions have been updated to reference the new constant, ensuring consistency across the codebase.
* [autofix.ci] apply automated fixes
* feat(telemetry): add track_in_telemetry field to starter project configurations
* refactor(telemetry): remove unused blank line in test imports
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* update starter templates
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)

---------

Co-authored-by: Claude <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
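
The splitting behavior described in the commit messages above can be sketched with the standard library alone. This is a simplified greedy packer under stated assumptions, not the PR's `split_if_needed`: the endpoint URL, the helper names, the zero-based `chunkIndex`, and the compact JSON encoding are all hypothetical, and a single value that exceeds the limit on its own is not truncated here.

```python
import json
from urllib.parse import urlencode

MAX_URL_SIZE = 2048  # mirrors MAX_TELEMETRY_URL_SIZE as stated in the commit log
BASE_URL = "https://telemetry.example.invalid/component_inputs"  # hypothetical


def url_size(fixed: dict, inputs: dict) -> int:
    """Length of the GET URL when inputs are sent as one compact JSON param."""
    encoded = json.dumps(inputs, separators=(",", ":"))
    return len(f"{BASE_URL}?{urlencode({**fixed, 'componentInputs': encoded})}")


def split_inputs(fixed: dict, inputs: dict, max_size: int = MAX_URL_SIZE) -> list[dict]:
    """Greedily pack input fields into chunks whose URLs fit within max_size."""
    if url_size(fixed, inputs) <= max_size:
        return [{**fixed, "componentInputs": inputs}]

    # Reserve room for the chunk metadata that every split payload carries.
    sized_fixed = {**fixed, "chunkIndex": 999, "totalChunks": 999}
    chunks: list[dict] = []
    current: dict = {}
    for key, value in inputs.items():
        candidate = {**current, key: value}
        if current and url_size(sized_fixed, candidate) > max_size:
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)

    total = len(chunks)
    return [
        {**fixed, "componentInputs": chunk, "chunkIndex": index, "totalChunks": total}
        for index, chunk in enumerate(chunks)
    ]


fixed_fields = {"componentRunId": "run-1", "componentName": "Demo"}
large_inputs = {f"field_{i}": "x" * 300 for i in range(20)}
chunks = split_inputs(fixed_fields, large_inputs)
print(len(chunks), [len(c["componentInputs"]) for c in chunks])
```

Each chunk repeats the fixed fields, so a receiver could regroup chunks by `componentRunId` and merge their `componentInputs` to recover the original mapping.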

…parse errors (#10508) * feat: Add ALTK Agent with tool validation and comprehensive tests (#10587) * Add ALTK Agent with tool validation and comprehensive tests - Added agent-lifecycle-toolkit~=0.4.1 dependency to pyproject.toml - Implemented ALTKBaseAgent with comprehensive error handling and tool validation - Added ALTKToolWrappers for SPARC integration and tool execution safety - Created ALTK Agent component with proper LangChain integration - Added comprehensive test suite covering tool validation, conversation context, and edge cases - Fixed docstring formatting to comply with ruff linting standards * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * minor fix to execute_tool that was left out. * Fixes following coderabbitai comments. * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * Update component_index.json * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * Add custom message to dict conversion in ValidatedTool * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * Add Notion integration components to index Updated component_index.json to include new Notion integration components: AddContentToPage, NotionDatabaseProperties, NotionListPages, NotionPageContent, NotionPageCreator, NotionPageUpdate, and NotionSearch. These components provide functionality for interacting with Notion databases and pages, including querying, creating, updating, and retrieving content. 
* [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: Koren Lazar <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: Edwin Jose <[email protected]> * feat: Implement dynamic model discovery system (#10523) * add dynamic model request * add description to groq * add cache folder to store cache models json * change git ignore description * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * add comprehensive tests for Groq dynamic model discovery - Add 101 unit tests covering success, error, and edge cases - Test model discovery, caching, tool calling detection - Test fallback models and backward compatibility - Add support for real GROQ_API_KEY from environment - Fix all lint errors and improve code quality * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * fix Python 3.10 compatibility - replace UTC with timezone.utc Python 3.10 doesn't have datetime.UTC, need to use timezone.utc instead * fix pytest hook signature - use config instead of _config * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * fix conftest config.py * fix timezone UTC on tests * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: remove `code` from Transactions to reduce clutter in logs 
(#10400) * refactor: remove code from transaction model inputs * refactor: remove code from transaction model inputs * tests: add tests to make sure code is not added to transactions data * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * refactor: improve code removal from logs with explicit dict copying --------- Co-authored-by: Edwin Jose <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: new release for cuga component (#10591) * feat: new release of cuga * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * fix: address review * fix: fixed more bugs * fix: build component index * [autofix.ci] apply automated fixes * fix: update test * chore: update component index * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: adds Component Inputs telemetry (#10254) * feat: Introduce telemetry tracking for sensitive field types Added a new set of field types that should not be tracked in telemetry due to their sensitive nature, including PASSWORD, AUTH, FILE, CONNECTION, and MCP. Updated relevant input classes to ensure telemetry tracking is disabled for these sensitive fields, enhancing data privacy and security. * feat: Enhance telemetry payloads with additional fields and serialization support Added new fields to the ComponentPayload and ComponentInputsPayload classes, including component_id and component_run_id, to improve telemetry data tracking. Introduced a serialize_input_values function to handle JSON serialization of component input values, ensuring robust handling of input data for telemetry purposes. 
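For illustration, the serialization step described above could look roughly like the sketch below. The name `serialize_input_values` comes from the commit message, but the signature, the `str()` fallback, and the empty-object fallback are assumptions made for this example, not the actual implementation.

```python
import json
from typing import Any


def serialize_input_values(values: dict[str, Any]) -> str:
    """Serialize input values to a JSON string (hypothetical signature).

    Values that json cannot encode natively fall back to str(), so an
    unusual input type does not raise inside the telemetry path.
    """
    try:
        return json.dumps(values, default=str)
    except (TypeError, ValueError):
        # Unsupported key types or circular references: report an empty object.
        return "{}"
```

Falling back to a string keeps telemetry best-effort: a value that cannot be encoded degrades to its text form instead of failing the component run.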
* feat: Implement telemetry input tracking and caching Added functionality to track and cache telemetry input values within the Component class. Introduced a method to determine if inputs should be tracked based on sensitivity and an accessor for retrieving cached telemetry data, enhancing the robustness of telemetry handling. * feat: Add logging for component input telemetry Introduced a new method, log_package_component_inputs, to the TelemetryService for logging telemetry data related to component inputs. This enhancement improves the tracking capabilities of the telemetry system, allowing for more detailed insights into component interactions. * feat: Enhance telemetry logging for component execution Added functionality to log component input telemetry both during successful execution and error cases. Introduced a unique component_run_id for each execution to improve tracking. This update ensures comprehensive telemetry data collection, enhancing the robustness of the telemetry system. * feat: Extend telemetry payload tests and enhance serialization Added tests for the new component_id and component_run_id fields in ComponentPayload and ComponentInputsPayload classes. Introduced a new test suite for ComponentInputTelemetry, covering serialization of various data types and handling of edge cases. This update improves the robustness and coverage of telemetry data handling in the system. * fix: Update default telemetry tracking behavior in BaseInputMixin Changed the default value of track_in_telemetry from True to False in the BaseInputMixin class. Updated documentation to clarify that telemetry tracking is now opt-in and can be explicitly enabled for individual input types, enhancing data privacy and control. * fix: Update telemetry tracking defaults for input types Modified the default value of `track_in_telemetry` for various input classes to enhance data privacy. 
Regular inputs now default to False, while safe inputs like `IntInput` and `BoolInput` default to True, ensuring explicit opt-in for telemetry tracking. Updated related tests to reflect these changes. * feat: add chunk_index and total_chunks fields to ComponentInputsPayload This commit adds two new optional fields to ComponentInputsPayload: - chunk_index: Index of this chunk in a split payload sequence - total_chunks: Total number of chunks in the split sequence Both fields default to None and use camelCase aliases for serialization. This is Task 1 of the telemetry query parameter splitting implementation. Tests included: - Verify fields exist and can be set - Verify camelCase serialization aliases work correctly - Verify fields default to None when not provided Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * refactor: update ComponentInputsPayload to support automatic splitting of oversized inputs This commit enhances the ComponentInputsPayload class by implementing functionality to automatically split input values into multiple chunks if they exceed the maximum URL size limit. Key changes include: - Added methods for calculating URL size, truncating oversized values, and splitting payloads. - Updated component_inputs field to accept a dictionary instead of a string for better handling of input values. - Improved documentation for the ComponentInputsPayload class to reflect the new splitting behavior and usage examples. These changes aim to improve telemetry data handling and ensure compliance with URL length restrictions. * refactor: enhance log_package_component_inputs to handle oversized payloads This commit updates the log_package_component_inputs method in the TelemetryService class to split component input payloads into multiple requests if they exceed the maximum URL size limit. Key changes include: - Added logic to split the payload using the new split_if_needed method. 
- Each chunk is queued separately for telemetry logging. These improvements ensure better handling of telemetry data while adhering to URL length restrictions. * refactor: centralize maximum telemetry URL size constant This commit introduces a centralized constant, MAX_TELEMETRY_URL_SIZE, to define the maximum URL length for telemetry GET requests. Key changes include: - Added MAX_TELEMETRY_URL_SIZE constant to schema.py for better maintainability. - Updated split_if_needed method in ComponentInputsPayload to use the new constant instead of a hardcoded value. - Adjusted the TelemetryService to reference the centralized constant for URL size limits. These changes enhance code clarity and ensure consistent handling of URL size limits across the telemetry service. * refactor: update ComponentInputsPayload tests to use dictionary inputs This commit modifies the tests for ComponentInputsPayload to utilize a dictionary for component inputs instead of a serialized JSON string. Key changes include: - Renamed the test method to reflect the new input type. - Removed unnecessary serialization steps and assertions related to JSON strings. - Added assertions to verify the correct handling of dictionary inputs. These changes streamline the testing process and improve clarity in how component inputs are represented. * test: add integration tests for telemetry service payload splitting This commit introduces integration tests for the TelemetryService to verify its handling of large and small payloads. Key changes include: - Added tests to ensure large payloads are split into multiple chunks and queued correctly. - Implemented a test to confirm that small payloads are not split and result in a single queued event. - Created a mock settings service for testing purposes. These tests enhance the reliability of the telemetry service by ensuring proper payload management. 
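The splitting behavior described in the commits above can be sketched as follows. This is a minimal greedy version under stated assumptions: `MAX_TELEMETRY_URL_SIZE` is named in the log (2048 is the value it is later set to), while `BASE_URL`, `url_size`, `split_inputs`, and the query parameter names `componentInputs`, `chunkIndex`, and `totalChunks` are hypothetical and only mirror the camelCase aliasing the log mentions.

```python
import json
from typing import Any
from urllib.parse import urlencode

MAX_TELEMETRY_URL_SIZE = 2048  # limit named in the commit log
BASE_URL = "https://telemetry.example.invalid/component_inputs"  # placeholder


def url_size(inputs: dict[str, Any], extra: dict[str, Any]) -> int:
    """Length of the GET URL once the inputs are JSON- and URL-encoded."""
    params = {**extra, "componentInputs": json.dumps(inputs, default=str)}
    return len(f"{BASE_URL}?{urlencode(params)}")


def split_inputs(
    inputs: dict[str, Any],
    fixed: dict[str, Any],
    max_size: int = MAX_TELEMETRY_URL_SIZE,
) -> list[dict[str, Any]]:
    """Greedily pack inputs into chunks whose URLs fit within max_size.

    A single value too large for one URL is left in its own chunk; the
    truncation step mentioned in the log is omitted from this sketch.
    """
    if url_size(inputs, fixed) <= max_size:
        return [dict(inputs)]  # small payload: no split, no chunk metadata

    # Reserve room for chunk metadata when measuring candidate chunks.
    meta = {**fixed, "chunkIndex": 999, "totalChunks": 999}
    chunks: list[dict[str, Any]] = []
    current: dict[str, Any] = {}
    for key, value in inputs.items():
        candidate = {**current, key: value}
        if current and url_size(candidate, meta) > max_size:
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Measuring the encoded URL rather than the raw JSON matters because percent-encoding inflates quotes, braces, and commas to three characters each.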
* test: enhance ComponentInputsPayload tests with additional scenarios This commit expands the test suite for ComponentInputsPayload by adding various scenarios to ensure robust handling of input payloads. Key changes include: - Introduced tests for calculating URL size, ensuring it returns a positive integer and accounts for encoding. - Added tests to verify the splitting logic for large payloads, including checks for chunk metadata and preservation of fixed fields. - Implemented property-based tests using Hypothesis to validate that all chunks respect the maximum URL size and preserve original data. These enhancements improve the reliability and coverage of the ComponentInputsPayload tests, ensuring proper functionality under various conditions. * [autofix.ci] apply automated fixes * optimize query param encoding Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com> * refactor: extract telemetry logging logic into a separate function This commit introduces a new function, _log_component_input_telemetry, to centralize the logic for logging component input telemetry. The function is called in two places within the generate_flow_events function, improving code readability and maintainability by reducing duplication. This change enhances the clarity of telemetry handling in the flow generation process. * refactor: optimize truncation logic in ComponentInputsPayload This commit refines the truncation logic for input values in the ComponentInputsPayload class. The previous binary search method for string values has been simplified, allowing for direct truncation of both string and non-string values. This change enhances code clarity and maintains functionality while ensuring optimal handling of oversized inputs. 
* refactor: update telemetry tracking logic to respect opt-in flag This commit modifies the telemetry tracking logic in the Component class to change the default behavior of the `track_in_telemetry` attribute from True to False. This adjustment enhances user privacy by requiring explicit consent for tracking input objects in telemetry. The change ensures that sensitive field types are still auto-excluded from tracking, maintaining the integrity of the telemetry data. * refactor: update tests to use dictionary format for component inputs This commit modifies the integration tests for telemetry payload validation and component input telemetry to utilize dictionaries for component inputs instead of serialized JSON strings. Key changes include: - Updated assertions to compare dictionary inputs directly. - Enhanced clarity and maintainability of the test cases by removing unnecessary serialization steps. These changes improve the representation of component inputs in tests, aligning with recent refactoring efforts. * [autofix.ci] apply automated fixes * refactor: specify type for current_chunk_inputs in ComponentInputsPayload This commit updates the type annotation for the current_chunk_inputs variable in the ComponentInputsPayload class to explicitly define it as a dictionary. This change enhances code clarity and maintainability by providing better type information for developers working with the code. * test: add component_id to ComponentPayload tests This commit enhances the test cases for the ComponentPayload class by adding a component_id parameter to various initialization tests. The updates ensure that the component_id is properly tested across different scenarios, including valid parameters, error messages, and edge cases. This change improves the robustness of the tests and aligns with recent updates to the ComponentPayload structure. 
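The opt-in rule described in these commits (tracking is off unless `track_in_telemetry` is set, and sensitive field types are always excluded) might be sketched like this. The `FieldType` enum, `InputField` class, and helper names are hypothetical stand-ins; only the flag name and the list of sensitive types (PASSWORD, AUTH, FILE, CONNECTION, MCP) come from the log.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class FieldType(Enum):  # hypothetical stand-in for the real field types
    TEXT = "text"
    INTEGER = "int"
    PASSWORD = "password"
    AUTH = "auth"
    FILE = "file"
    CONNECTION = "connection"
    MCP = "mcp"


# Field types the commit log lists as never tracked.
SENSITIVE_FIELD_TYPES = frozenset(
    {
        FieldType.PASSWORD,
        FieldType.AUTH,
        FieldType.FILE,
        FieldType.CONNECTION,
        FieldType.MCP,
    }
)


@dataclass
class InputField:
    name: str
    field_type: FieldType
    value: Any = None
    track_in_telemetry: bool = False  # opt-in: off unless explicitly enabled


def should_track(field: InputField) -> bool:
    """Sensitive types are excluded even if the flag was set by mistake."""
    if field.field_type in SENSITIVE_FIELD_TYPES:
        return False
    return field.track_in_telemetry


def telemetry_inputs(fields: list[InputField]) -> dict[str, Any]:
    """Collect only the values that are allowed to leave the process."""
    return {f.name: f.value for f in fields if should_track(f)}
```

Checking the sensitive set before the flag means a misconfigured input cannot leak a credential by opting in.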
* [autofix.ci] apply automated fixes * feat: add component_id to ComponentPayload in build_vertex function * fix: update MAX_TELEMETRY_URL_SIZE to 2048 and adjust related tests This commit increases the maximum URL size for telemetry GET requests from 2000 to 2048 bytes to align with Scarf's specifications. Corresponding test assertions have been updated to reference the new constant, ensuring consistency across the codebase. * [autofix.ci] apply automated fixes * feat(telemetry): add track_in_telemetry field to starter project configurations * refactor(telemetry): remove unused blank line in test imports * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * update starter templates * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: Claude <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com> * feat: Add gpt-5.1 model to Language models (#10590) * Add gpt-5.1 model to starter projects Added 'gpt-5.1' to the list of available models in all starter project JSON files to support the new model version. This update ensures users can select gpt-5.1 in agent configurations. 
* [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * Update component_index.json * [autofix.ci] apply automated fixes * Update component_index.json * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * fix: Use the proper Embeddings import for Qdrant vector store (#10613) * Use the proper Embeddings import for Qdrant vector store * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: Madhavan <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * fix: Ensure split text test is more robust (#10622) * fix: Ensure split text test is more robust * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * fix: use issubclass in the pool creation (#10232) * use issubclass in the pool creation * [autofix.ci] apply automated fixes * add poolclass pytests * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: Hamza Rashid <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: cristhianzl <[email protected]> * feat: make it possible to load graphs using `get_graph` function in scripts (#9913) * feat: Enhance graph loading functionality to support async retrieval - Updated `load_graph_from_script` to be an async function, allowing for the retrieval of the graph via an async `get_graph` function if 
available. - Implemented fallback to the existing `graph` variable for backward compatibility. - Enhanced `find_graph_variable` to identify both `get_graph` function definitions and `graph` variable assignments, improving flexibility in script handling. * feat: Update load_graph_from_script to support async graph retrieval - Refactored `load_graph_from_script` to be an async function, enabling the use of an async `get_graph` function for graph retrieval. - Implemented a fallback mechanism to access the `graph` variable for backward compatibility. - Enhanced error handling to provide clearer messages when neither `graph` nor `get_graph()` is found in the script. * feat: Refactor simple_agent.py to support async graph creation - Introduced an async `get_graph` function to handle the initialization of components and graph creation without blocking. - Updated the logging configuration and component setup to be part of the async function, improving the overall flow and responsiveness. - Enhanced documentation for the `get_graph` function to clarify its purpose and return type. * feat: Update serve_command and run functions to support async graph loading - Refactored `serve_command` to be an async function using `syncify`, allowing for non-blocking execution. - Updated calls to `load_graph_from_path` and `load_graph_from_script` within `serve_command` and `run` to await their results, enhancing performance and responsiveness. - Improved overall async handling in the CLI commands for better integration with async workflows. * feat: Refactor load_graph_from_path to support async execution - Changed `load_graph_from_path` to an async function, enabling non-blocking graph loading. - Updated the call to `load_graph_from_script` to use await, improving performance during graph retrieval. - Enhanced the overall async handling in the CLI for better integration with async workflows. 
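The lookup order described for graph loading (prefer `get_graph`, fall back to the `graph` variable, otherwise raise a clear error) can be sketched as below. `resolve_graph` and its namespace argument are hypothetical; the real `load_graph_from_script` works from a script path and may differ in detail.

```python
import asyncio
import inspect
from typing import Any


async def resolve_graph(namespace: dict[str, Any]) -> Any:
    """Return the graph from an executed script's namespace (hypothetical helper)."""
    get_graph = namespace.get("get_graph")
    if callable(get_graph):
        result = get_graph()
        # Accept both `async def get_graph()` and a plain function.
        if inspect.isawaitable(result):
            result = await result
        return result
    if "graph" in namespace:
        # Backward-compatible path: a module-level `graph` variable.
        return namespace["graph"]
    msg = "Script defines neither `graph` nor `get_graph()`"
    raise ValueError(msg)
```

A caller that is not already inside an event loop could run it with `asyncio.run(resolve_graph(namespace))`.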
* feat: Enhance async handling in simple_agent and related tests - Updated `get_graph` function in `simple_agent.py` to utilize async component initialization for improved responsiveness. - Modified test cases in `test_simple_agent_in_lfx_run.py` to validate the async behavior of `get_graph`. - Refactored various test functions across multiple files to support async execution, ensuring compatibility with the new async workflows. - Improved documentation for async functions to clarify their purpose and usage. * docs: Implement async get_graph function for improved component initialization - Introduced an async `get_graph` function in `README.md` to facilitate non-blocking component initialization. - Enhanced the logging configuration and component setup within the async function, ensuring a smoother flow. - Updated documentation to clarify the purpose and return type of the `get_graph` function, aligning with the async handling improvements. * refactor: reorder imports in simple_agent test file * style: reorder imports in simple_agent test file * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * update component index * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: Add ALTK Agent with tool validation and comprehensive tests (#10587) * Add ALTK Agent with tool validation and comprehensive tests - Added agent-lifecycle-toolkit~=0.4.1 dependency to pyproject.toml - Implemented ALTKBaseAgent with comprehensive error handling and tool validation - Added ALTKToolWrappers for SPARC integration and tool execution safety - Created ALTK Agent component with 
proper LangChain integration - Added comprehensive test suite covering tool validation, conversation context, and edge cases - Fixed docstring formatting to comply with ruff linting standards * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * minor fix to execute_tool that was left out. * Fixes following coderabbitai comments. * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * Update component_index.json * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * Add custom message to dict conversion in ValidatedTool * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * Add Notion integration components to index Updated component_index.json to include new Notion integration components: AddContentToPage, NotionDatabaseProperties, NotionListPages, NotionPageContent, NotionPageCreator, NotionPageUpdate, and NotionSearch. These components provide functionality for interacting with Notion databases and pages, including querying, creating, updating, and retrieving content. 
* [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: Koren Lazar <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: Edwin Jose <[email protected]> * feat: Implement dynamic model discovery system (#10523) * add dynamic model request * add description to groq * add cache folder to store cache models json * change git ignore description * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * add comprehensive tests for Groq dynamic model discovery - Add 101 unit tests covering success, error, and edge cases - Test model discovery, caching, tool calling detection - Test fallback models and backward compatibility - Add support for real GROQ_API_KEY from environment - Fix all lint errors and improve code quality * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * fix Python 3.10 compatibility - replace UTC with timezone.utc Python 3.10 doesn't have datetime.UTC, need to use timezone.utc instead * fix pytest hook signature - use config instead of _config * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * fix conftest config.py * fix timezone UTC on tests * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: remove `code` from Transactions to reduce clutter in logs 
(#10400) * refactor: remove code from transaction model inputs * refactor: remove code from transaction model inputs * tests: add tests to make sure code is not added to transactions data * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * refactor: improve code removal from logs with explicit dict copying --------- Co-authored-by: Edwin Jose <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: new release for cuga component (#10591) * feat: new release of cuga * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * fix: address review * fix: fixed more bugs * fix: build component index * [autofix.ci] apply automated fixes * fix: update test * chore: update component index * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> * feat: adds Component Inputs telemetry (#10254) * feat: Introduce telemetry tracking for sensitive field types Added a new set of field types that should not be tracked in telemetry due to their sensitive nature, including PASSWORD, AUTH, FILE, CONNECTION, and MCP. Updated relevant input classes to ensure telemetry tracking is disabled for these sensitive fields, enhancing data privacy and security. * feat: Enhance telemetry payloads with additional fields and serialization support Added new fields to the ComponentPayload and ComponentInputsPayload classes, including component_id and component_run_id, to improve telemetry data tracking. Introduced a serialize_input_values function to handle JSON serialization of component input values, ensuring robust handling of input data for telemetry purposes. 
* feat: Implement telemetry input tracking and caching Added functionality to track and cache telemetry input values within the Component class. Introduced a method to determine if inputs should be tracked based on sensitivity and an accessor for retrieving cached telemetry data, enhancing the robustness of telemetry handling. * feat: Add logging for component input telemetry Introduced a new method, log_package_component_inputs, to the TelemetryService for logging telemetry data related to component inputs. This enhancement improves the tracking capabilities of the telemetry system, allowing for more detailed insights into component interactions. * feat: Enhance telemetry logging for component execution Added functionality to log component input telemetry both during successful execution and error cases. Introduced a unique component_run_id for each execution to improve tracking. This update ensures comprehensive telemetry data collection, enhancing the robustness of the telemetry system. * feat: Extend telemetry payload tests and enhance serialization Added tests for the new component_id and component_run_id fields in ComponentPayload and ComponentInputsPayload classes. Introduced a new test suite for ComponentInputTelemetry, covering serialization of various data types and handling of edge cases. This update improves the robustness and coverage of telemetry data handling in the system. * fix: Update default telemetry tracking behavior in BaseInputMixin Changed the default value of track_in_telemetry from True to False in the BaseInputMixin class. Updated documentation to clarify that telemetry tracking is now opt-in and can be explicitly enabled for individual input types, enhancing data privacy and control. * fix: Update telemetry tracking defaults for input types Modified the default value of `track_in_telemetry` for various input classes to enhance data privacy. 
Regular inputs now default to False, while safe inputs like `IntInput` and `BoolInput` default to True, ensuring explicit opt-in for telemetry tracking. Updated related tests to reflect these changes. * feat: add chunk_index and total_chunks fields to ComponentInputsPayload This commit adds two new optional fields to ComponentInputsPayload: - chunk_index: Index of this chunk in a split payload sequence - total_chunks: Total number of chunks in the split sequence Both fields default to None and use camelCase aliases for serialization. This is Task 1 of the telemetry query parameter splitting implementation. Tests included: - Verify fields exist and can be set - Verify camelCase serialization aliases work correctly - Verify fields default to None when not provided Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * refactor: update ComponentInputsPayload to support automatic splitting of oversized inputs This commit enhances the ComponentInputsPayload class by implementing functionality to automatically split input values into multiple chunks if they exceed the maximum URL size limit. Key changes include: - Added methods for calculating URL size, truncating oversized values, and splitting payloads. - Updated component_inputs field to accept a dictionary instead of a string for better handling of input values. - Improved documentation for the ComponentInputsPayload class to reflect the new splitting behavior and usage examples. These changes aim to improve telemetry data handling and ensure compliance with URL length restrictions. * refactor: enhance log_package_component_inputs to handle oversized payloads This commit updates the log_package_component_inputs method in the TelemetryService class to split component input payloads into multiple requests if they exceed the maximum URL size limit. Key changes include: - Added logic to split the payload using the new split_if_needed method. 
- Each chunk is queued separately for telemetry logging. These improvements ensure better handling of telemetry data while adhering to URL length restrictions. * refactor: centralize maximum telemetry URL size constant This commit introduces a centralized constant, MAX_TELEMETRY_URL_SIZE, to define the maximum URL length for telemetry GET requests. Key changes include: - Added MAX_TELEMETRY_URL_SIZE constant to schema.py for better maintainability. - Updated split_if_needed method in ComponentInputsPayload to use the new constant instead of a hardcoded value. - Adjusted the TelemetryService to reference the centralized constant for URL size limits. These changes enhance code clarity and ensure consistent handling of URL size limits across the telemetry service. * refactor: update ComponentInputsPayload tests to use dictionary inputs This commit modifies the tests for ComponentInputsPayload to utilize a dictionary for component inputs instead of a serialized JSON string. Key changes include: - Renamed the test method to reflect the new input type. - Removed unnecessary serialization steps and assertions related to JSON strings. - Added assertions to verify the correct handling of dictionary inputs. These changes streamline the testing process and improve clarity in how component inputs are represented. * test: add integration tests for telemetry service payload splitting This commit introduces integration tests for the TelemetryService to verify its handling of large and small payloads. Key changes include: - Added tests to ensure large payloads are split into multiple chunks and queued correctly. - Implemented a test to confirm that small payloads are not split and result in a single queued event. - Created a mock settings service for testing purposes. These tests enhance the reliability of the telemetry service by ensuring proper payload management. 
* test: enhance ComponentInputsPayload tests with additional scenarios This commit expands the test suite for ComponentInputsPayload by adding various scenarios to ensure robust handling of input payloads. Key changes include: - Introduced tests for calculating URL size, ensuring it returns a positive integer and accounts for encoding. - Added tests to verify the splitting logic for large payloads, including checks for chunk metadata and preservation of fixed fields. - Implemented property-based tests using Hypothesis to validate that all chunks respect the maximum URL size and preserve original data. These enhancements improve the reliability and coverage of the ComponentInputsPayload tests, ensuring proper functionality under various conditions. * [autofix.ci] apply automated fixes * optimize query param encoding Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com> * refactor: extract telemetry logging logic into a separate function This commit introduces a new function, _log_component_input_telemetry, to centralize the logic for logging component input telemetry. The function is called in two places within the generate_flow_events function, improving code readability and maintainability by reducing duplication. This change enhances the clarity of telemetry handling in the flow generation process. * refactor: optimize truncation logic in ComponentInputsPayload This commit refines the truncation logic for input values in the ComponentInputsPayload class. The previous binary search method for string values has been simplified, allowing for direct truncation of both string and non-string values. This change enhances code clarity and maintains functionality while ensuring optimal handling of oversized inputs. 
* refactor: update telemetry tracking logic to respect opt-in flag This commit modifies the telemetry tracking logic in the Component class to change the default behavior of the `track_in_telemetry` attribute from True to False. This adjustment enhances user privacy by requiring explicit consent for tracking input objects in telemetry. The change ensures that sensitive field types are still auto-excluded from tracking, maintaining the integrity of the telemetry data. * refactor: update tests to use dictionary format for component inputs This commit modifies the integration tests for telemetry payload validation and component input telemetry to utilize dictionaries for component inputs instead of serialized JSON strings. Key changes include: - Updated assertions to compare dictionary inputs directly. - Enhanced clarity and maintainability of the test cases by removing unnecessary serialization steps. These changes improve the representation of component inputs in tests, aligning with recent refactoring efforts. * [autofix.ci] apply automated fixes * refactor: specify type for current_chunk_inputs in ComponentInputsPayload This commit updates the type annotation for the current_chunk_inputs variable in the ComponentInputsPayload class to explicitly define it as a dictionary. This change enhances code clarity and maintainability by providing better type information for developers working with the code. * test: add component_id to ComponentPayload tests This commit enhances the test cases for the ComponentPayload class by adding a component_id parameter to various initialization tests. The updates ensure that the component_id is properly tested across different scenarios, including valid parameters, error messages, and edge cases. This change improves the robustness of the tests and aligns with recent updates to the ComponentPayload structure. 
* [autofix.ci] apply automated fixes
* feat: add component_id to ComponentPayload in build_vertex function
* fix: update MAX_TELEMETRY_URL_SIZE to 2048 and adjust related tests
  This commit increases the maximum URL size for telemetry GET requests from 2000 to 2048 bytes to align with Scarf's specifications. Corresponding test assertions have been updated to reference the new constant, ensuring consistency across the codebase.
* [autofix.ci] apply automated fixes
* feat(telemetry): add track_in_telemetry field to starter project configurations
* refactor(telemetry): remove unused blank line in test imports
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* update starter templates
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
---------
Co-authored-by: Claude <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
* feat: Add gpt-5.1 model to Language models (#10590)
* Add gpt-5.1 model to starter projects
  Added 'gpt-5.1' to the list of available models in all starter project JSON files to support the new model version. This update ensures users can select gpt-5.1 in agent configurations.
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* Update component_index.json
* [autofix.ci] apply automated fixes
* Update component_index.json
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: Use the proper Embeddings import for Qdrant vector store (#10613)
* Use the proper Embeddings import for Qdrant vector store
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
---------
Co-authored-by: Madhavan <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: Ensure split text test is more robust (#10622)
* fix: Ensure split text test is more robust
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: use issubclass in the pool creation (#10232)
* use issubclass in the pool creation
* [autofix.ci] apply automated fixes
* add poolclass pytests
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
---------
Co-authored-by: Hamza Rashid <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: cristhianzl <[email protected]>
* feat: make it possible to load graphs using `get_graph` function in scripts (#9913)
* feat: Enhance graph loading functionality to support async retrieval
  - Updated `load_graph_from_script` to be an async function, allowing for the retrieval of the graph via an async `get_graph` function if available.
  - Implemented fallback to the existing `graph` variable for backward compatibility.
  - Enhanced `find_graph_variable` to identify both `get_graph` function definitions and `graph` variable assignments, improving flexibility in script handling.
* feat: Update load_graph_from_script to support async graph retrieval
  - Refactored `load_graph_from_script` to be an async function, enabling the use of an async `get_graph` function for graph retrieval.
  - Implemented a fallback mechanism to access the `graph` variable for backward compatibility.
  - Enhanced error handling to provide clearer messages when neither `graph` nor `get_graph()` is found in the script.
* feat: Refactor simple_agent.py to support async graph creation
  - Introduced an async `get_graph` function to handle the initialization of components and graph creation without blocking.
  - Updated the logging configuration and component setup to be part of the async function, improving the overall flow and responsiveness.
  - Enhanced documentation for the `get_graph` function to clarify its purpose and return type.
* feat: Update serve_command and run functions to support async graph loading
  - Refactored `serve_command` to be an async function using `syncify`, allowing for non-blocking execution.
  - Updated calls to `load_graph_from_path` and `load_graph_from_script` within `serve_command` and `run` to await their results, enhancing performance and responsiveness.
  - Improved overall async handling in the CLI commands for better integration with async workflows.
* feat: Refactor load_graph_from_path to support async execution
  - Changed `load_graph_from_path` to an async function, enabling non-blocking graph loading.
  - Updated the call to `load_graph_from_script` to use await, improving performance during graph retrieval.
  - Enhanced the overall async handling in the CLI for better integration with async workflows.
* feat: Enhance async handling in simple_agent and related tests
  - Updated `get_graph` function in `simple_agent.py` to utilize async component initialization for improved responsiveness.
  - Modified test cases in `test_simple_agent_in_lfx_run.py` to validate the async behavior of `get_graph`.
  - Refactored various test functions across multiple files to support async execution, ensuring compatibility with the new async workflows.
  - Improved documentation for async functions to clarify their purpose and usage.
* docs: Implement async get_graph function for improved component initialization
  - Introduced an async `get_graph` function in `README.md` to facilitate non-blocking component initialization.
  - Enhanced the logging configuration and component setup within the async function, ensuring a smoother flow.
  - Updated documentation to clarify the purpose and return type of the `get_graph` function, aligning with the async handling improvements.
* refactor: reorder imports in simple_agent test file
* style: reorder imports in simple_agent test file
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* update component index
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: prevent UI from getting stuck when switching to cURL mode after parse errors
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes
---------
Co-authored-by: Koren Lazar <[email protected]>
Co-authored-by: Koren Lazar <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Edwin Jose <[email protected]>
Co-authored-by: Cristhian Zanforlin Lousa <[email protected]>
Co-authored-by: Gabriel Luiz Freitas Almeida <[email protected]>
Co-authored-by: Sami Marreed <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: Madhavan <[email protected]>
Co-authored-by: Madhavan <[email protected]>
Co-authored-by: Eric Hare <[email protected]>
Co-authored-by: ming <[email protected]>
Co-authored-by: Hamza Rashid <[email protected]>
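
For readers skimming the commit notes above, the opt-in tracking rule (inputs are tracked only when `track_in_telemetry` is explicitly enabled, and sensitive field types are always excluded) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code from this PR: `FieldType`, `InputSpec`, `should_track`, and `collect_tracked_inputs` are hypothetical names, and the enum values are placeholders. Only the `track_in_telemetry` flag, its default of False, and the PASSWORD, AUTH, FILE, CONNECTION, and MCP categories come from the PR description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class FieldType(str, Enum):
    """Hypothetical subset of field types; the string values are placeholders."""

    TEXT = "text"
    INTEGER = "int"
    PASSWORD = "password"
    AUTH = "auth"
    FILE = "file"
    CONNECTION = "connection"
    MCP = "mcp"


# Field types the PR description lists as never tracked.
SENSITIVE_FIELD_TYPES = frozenset(
    {
        FieldType.PASSWORD,
        FieldType.AUTH,
        FieldType.FILE,
        FieldType.CONNECTION,
        FieldType.MCP,
    }
)


@dataclass
class InputSpec:
    """Hypothetical stand-in for a component input definition."""

    name: str
    field_type: FieldType
    value: Any = None
    track_in_telemetry: bool = False  # opt-in: tracking is off unless enabled


def should_track(spec: InputSpec) -> bool:
    """Sensitive types are excluded first; everything else needs the opt-in flag."""
    if spec.field_type in SENSITIVE_FIELD_TYPES:
        return False
    return spec.track_in_telemetry


def collect_tracked_inputs(specs: list[InputSpec]) -> dict[str, Any]:
    """Build the name -> value mapping that would be cached for telemetry."""
    return {spec.name: spec.value for spec in specs if should_track(spec)}
```

Checking the sensitive-type set before the flag means a field such as a password stays excluded even if someone sets `track_in_telemetry=True` on it by mistake.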

ogabrielluiz and others added 16 commits October 6, 2025 16:15

Merge branch 'main' into per-component-telemetry-payload

fbaedfc

ogabrielluiz requested review from Adam-Aghili, jordanrfrazier and ricofurtado October 13, 2025 20:05

[autofix.ci] apply automated fixes

2594e23

github-actions bot added the enhancement New feature or request label Oct 13, 2025

coderabbitai bot reviewed Oct 13, 2025

codeflash-ai bot reviewed Oct 13, 2025

src/backend/base/langflow/services/telemetry/schema.py Outdated (resolved)

optimize query param encoding

8098fff

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
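
As context for this commit, the URL-size calculation and payload splitting described in this PR's commit notes could be sketched roughly as follows. This is a simplified illustration, not the `ComponentInputsPayload` implementation: `BASE_URL`, `url_size`, `build_params`, `split_inputs`, and the `chunk_index` / `total_chunks` parameter names are assumptions made for the example. The 2048-byte limit and the `component_run_id` field come from the PR text. Truncation of a single oversized value, which the PR also mentions, is deliberately left out here.

```python
import json
from typing import Any
from urllib.parse import urlencode

MAX_TELEMETRY_URL_SIZE = 2048  # limit named in the commit notes

# Placeholder endpoint, assumed for illustration only.
BASE_URL = "https://telemetry.example.invalid/component_inputs"


def url_size(params: dict[str, Any]) -> int:
    """Size in bytes of the full GET URL after query-string encoding."""
    return len(f"{BASE_URL}?{urlencode(params)}".encode("utf-8"))


def build_params(
    fixed: dict[str, Any], inputs: dict[str, Any], index: int, total: int
) -> dict[str, Any]:
    """Combine fixed fields, one chunk of inputs, and chunk metadata."""
    return {
        **fixed,
        "component_inputs": json.dumps(inputs),
        "chunk_index": index,
        "total_chunks": total,
    }


def split_inputs(
    fixed: dict[str, Any],
    inputs: dict[str, Any],
    max_size: int = MAX_TELEMETRY_URL_SIZE,
) -> list[dict[str, Any]]:
    """Greedily pack inputs into chunks whose encoded URL fits max_size.

    Note: a single value that is too large on its own still ends up in an
    oversized chunk, because truncation is not implemented in this sketch.
    """
    chunks: list[dict[str, Any]] = []
    current: dict[str, Any] = {}
    for key, value in inputs.items():
        candidate = {**current, key: value}
        # 9999 stands in for the not-yet-known total so the estimate is
        # an upper bound for any total of up to four digits.
        estimate = url_size(build_params(fixed, candidate, len(chunks), 9999))
        if current and estimate > max_size:
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)
    return [build_params(fixed, c, i, len(chunks)) for i, c in enumerate(chunks)]
```

Measuring the encoded URL rather than the raw JSON matters because characters such as quotes and braces expand to three bytes each when percent-encoded, so the raw length underestimates the request size.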

github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Oct 13, 2025

ogabrielluiz added 2 commits October 13, 2025 18:16

github-actions bot added the enhancement New feature or request label Nov 14, 2025

[autofix.ci] apply automated fixes

21b924b