
Conversation

codeflash-ai bot commented Dec 10, 2025

⚡️ This pull request contains optimizations for PR #10953

If you approve this dependent PR, these changes will be merged into the original PR branch fix/folders_download.

This PR will be automatically closed if the original PR is merged.


📄 1,131% (11.31x) speedup for get_cache_service in src/backend/base/langflow/services/deps.py

⏱️ Runtime: 3.09 milliseconds → 251 microseconds (best of 146 runs)

📝 Explanation and details

The optimized code achieves an 11.31x speedup (1,131%) through two caching strategies that eliminate redundant, expensive operations:

1. Module-level Service Manager Caching
The original code calls get_service_manager() on every invocation, which is expensive. The optimization introduces a global _service_manager_cache that stores the service manager instance after first initialization. This eliminates:

  • Repeated imports of lfx.services.manager
  • Repeated get_service_manager() calls
  • Redundant factory registration checks and calls

From the profiler data, the import and service-manager creation (get_service_manager()) now happen only once instead of 38 times, significantly reducing overhead.
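For concreteness, here is a minimal sketch of the pattern. The helper name and placement are illustrative, and the manager API (are_factories_registered, register_factories, get_factories) follows the test stubs further down; the real deps.py may differ in details.

_service_manager_cache = None  # module-level singleton, populated on first use

def _get_cached_service_manager():
    # Return the process-wide service manager, creating it at most once.
    global _service_manager_cache
    if _service_manager_cache is None:
        # The import and the manager construction are the expensive steps;
        # after the first call both are skipped entirely.
        from lfx.services.manager import get_service_manager
        manager = get_service_manager()
        # Factory registration likewise happens at most once.
        if not manager.are_factories_registered():
            manager.register_factories(manager.get_factories())
        _service_manager_cache = manager
    return _service_manager_cache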

2. Factory Instance Caching in get_cache_service
The original code creates a new CacheServiceFactory() instance on every call. The optimization caches the factory as an attribute on the function itself, guarded by a hasattr check, since the same factory instance can safely be reused.
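A sketch of that function-attribute idiom follows. The body is hypothetical: it assumes, as in the test stubs below, that the manager's get() accepts a default factory, and that CacheServiceFactory and ServiceType are imported the same way deps.py imports them.

def get_cache_service():
    # Build the factory once and stash it on the function object itself;
    # hasattr is False only on the very first call.
    if not hasattr(get_cache_service, "_cached_factory"):
        get_cache_service._cached_factory = CacheServiceFactory()
    # Resolve the service through the (already cached) manager, passing the
    # reused factory as the default.
    manager = _get_cached_service_manager()
    return manager.get(ServiceType.CACHE_SERVICE, get_cache_service._cached_factory)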

Performance Impact Analysis
The line profiler shows that the core expensive operation (service_manager.get(), ~75% of runtime) is unchanged, while setup overhead drops dramatically: service-manager initialization and factory registration still account for ~24% of runtime but now occur once rather than on every call.

Test Case Performance
Based on the annotated tests, this optimization particularly benefits:

  • Repeated calls scenarios (like test_cache_service_reuse_instance) where the same service is requested multiple times
  • Large-scale operations (like test_cache_service_many_keys with 500 operations) where service retrieval happens frequently
  • Performance-critical paths where cache service access is repeated

The optimization is especially valuable in applications where get_cache_service() or get_service() are called frequently, as the first-call initialization cost is amortized across all subsequent calls.
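As a rough way to observe that amortization, one could run a micro-benchmark like the following (hypothetical, not taken from the report above):

import timeit
from langflow.services.deps import get_cache_service

get_cache_service()  # first call pays the one-time import/registration cost
per_call = timeit.timeit(get_cache_service, number=10_000) / 10_000
print(f"amortized cost per call: {per_call * 1e6:.2f} microseconds")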

Correctness verification report:

Test | Status
⚙️ Existing Unit Tests | 🔘 None Found
🌀 Generated Regression Tests | 38 Passed
⏪ Replay Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
📊 Tests Coverage | 100.0%
🌀 Generated Regression Tests and Runtime
from typing import Union

# imports
import pytest  # used for our unit tests
from langflow.services.deps import get_cache_service

# --- Begin: Minimal implementation of required classes/enums for testing ---
# These are minimal, in-memory implementations to allow actual testing of get_cache_service.
# They are NOT mocks/stubs, but simple real implementations.

class ServiceType:
    CACHE_SERVICE = "CACHE_SERVICE"

class CacheService:
    def __init__(self):
        self._cache = {}

    def set(self, key, value):
        self._cache[key] = value

    def get(self, key, default=None):
        return self._cache.get(key, default)

    def clear(self):
        self._cache.clear()

    def __len__(self):
        return len(self._cache)

class AsyncBaseCacheService(CacheService):
    # For testing, just inherits CacheService
    pass

class CacheServiceFactory:
    def __call__(self):
        # For testing, always return a CacheService instance
        return CacheService()

# unit tests

# --- Basic Test Cases ---

def test_cache_service_type_and_instance():
    # Should return an instance of CacheService or AsyncBaseCacheService
    codeflash_output = get_cache_service(); service = codeflash_output
    assert service is not None

def test_cache_service_set_and_get():
    # Should store and retrieve values correctly
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("foo", "bar")
    assert service.get("foo") == "bar"

def test_cache_service_clear():
    # Should clear all values
    codeflash_output = get_cache_service(); service = codeflash_output
    service.set("a", 1)
    service.set("b", 2)
    service.clear()
    assert len(service) == 0

def test_cache_service_len():
    # Should return correct length
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("x", 10)
    service.set("y", 20)
    assert len(service) == 2

# --- Edge Test Cases ---

def test_cache_service_none_key_and_value():
    # Should handle None as key and value
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set(None, "value_for_none")
    service.set("none_value", None)
    assert service.get(None) == "value_for_none"
    assert service.get("none_value") is None

def test_cache_service_empty_string_key_and_value():
    # Should handle empty string as key and value
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("", "empty_key")
    service.set("empty_value", "")
    assert service.get("") == "empty_key"
    assert service.get("empty_value") == ""

def test_cache_service_overwrite_value():
    # Should overwrite values for same key
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("dup", 1)
    service.set("dup", 2)
    assert service.get("dup") == 2

def test_cache_service_different_types():
    # Should handle different types as values
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("int", 123)
    service.set("float", 3.14)
    service.set("list", [1, 2, 3])
    service.set("dict", {"a": 1})
    assert service.get("int") == 123
    assert service.get("float") == 3.14
    assert service.get("list") == [1, 2, 3]
    assert service.get("dict") == {"a": 1}

def test_cache_service_large_key():
    # Should handle very large keys
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    large_key = "x" * 512
    service.set(large_key, "large_key_value")
    assert service.get(large_key) == "large_key_value"

def test_cache_service_large_value():
    # Should handle very large values
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    large_value = "y" * 1024
    service.set("large_value", large_value)
    assert service.get("large_value") == large_value

def test_cache_service_unicode_key_and_value():
    # Should handle unicode keys and values
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set("ключ", "значение")
    assert service.get("ключ") == "значение"

def test_cache_service_boolean_keys():
    # Should handle boolean keys
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    service.set(True, "yes")
    service.set(False, "no")
    assert service.get(True) == "yes"
    assert service.get(False) == "no"

def test_cache_service_tuple_key():
    # Should handle tuple as key
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    tup = (1, 2, 3)
    service.set(tup, "tuple_value")
    assert service.get(tup) == "tuple_value"

def test_cache_service_missing_key():
    # Should return None if key not present
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    assert service.get("missing") is None

# --- Large Scale Test Cases ---

def test_cache_service_many_keys():
    # Should handle many keys efficiently
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    N = 500  # Large but under 1000
    for i in range(N):
        service.set(f"key_{i}", i)
    for i in range(N):
        assert service.get(f"key_{i}") == i

def test_cache_service_large_values():
    # Should handle large values efficiently
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    N = 100
    large_value = "z" * 1000  # 1000 characters
    for i in range(N):
        service.set(f"big_{i}", large_value)
    for i in range(N):
        assert service.get(f"big_{i}") == large_value

def test_cache_service_clear_large_scale():
    # Should clear large cache efficiently
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    N = 500
    for i in range(N):
        service.set(f"key_{i}", i)
    service.clear()
    for i in range(N):
        assert service.get(f"key_{i}") is None

def test_cache_service_reuse_instance():
    # Should return the same instance on repeated calls
    codeflash_output = get_cache_service(); s1 = codeflash_output
    codeflash_output = get_cache_service(); s2 = codeflash_output
    assert s1 is s2

def test_cache_service_thread_safety_simulation():
    # Simulate concurrent access (not real threads, but interleaved calls)
    codeflash_output = get_cache_service(); service = codeflash_output
    service.clear()
    for i in range(100):
        service.set(f"t{i}", i)
    for i in range(100):
        assert service.get(f"t{i}") == i
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Union

# imports
import pytest  # used for our unit tests
from langflow.services.deps import get_cache_service


# Minimal stubs for dependencies to allow the tests to run
class CacheService:
    def __init__(self):
        self.storage = {}

    def set(self, key, value):
        self.storage[key] = value

    def get(self, key, default=None):
        return self.storage.get(key, default)

class AsyncBaseCacheService:
    def __init__(self):
        self.storage = {}

    async def set(self, key, value):
        self.storage[key] = value

    async def get(self, key, default=None):
        return self.storage.get(key, default)

class ServiceType:
    CACHE_SERVICE = "cache_service"
    OTHER_SERVICE = "other_service"

class CacheServiceFactory:
    def __call__(self):
        # Always returns a new CacheService instance for testing
        return CacheService()

class ServiceManager:
    def __init__(self):
        self._factories_registered = False
        self._services = {}

    def are_factories_registered(self):
        return self._factories_registered

    def register_factories(self, factories):
        self._factories_registered = True
        # Instantiate the factory class, then call the factory instance to build the service.
        factory = factories[ServiceType.CACHE_SERVICE]()
        self._services = {ServiceType.CACHE_SERVICE: factory()}

    def get(self, service_type, default=None):
        if service_type in self._services:
            return self._services[service_type]
        elif default is not None:
            return default()
        else:
            raise KeyError("Service not found")

    @staticmethod
    def get_factories():
        # Returns a dict mapping service types to factory callables
        return {ServiceType.CACHE_SERVICE: CacheServiceFactory}

# Global singleton for the service manager, simulating the real environment
_service_manager_instance = None

def get_service_manager():
    global _service_manager_instance
    if _service_manager_instance is None:
        _service_manager_instance = ServiceManager()
    return _service_manager_instance

# 1. Basic Test Cases

def test_get_cache_service_returns_cache_service_instance():
    """Test that get_cache_service returns an instance of CacheService."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    assert cache is not None

def test_cache_service_set_and_get():
    """Test basic set/get functionality of the cache service."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("foo", "bar")
    assert cache.get("foo") == "bar"

def test_cache_service_get_missing_key_returns_none():
    """Test that getting a missing key returns None by default."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    assert cache.get("missing_key") is None

def test_cache_service_get_missing_key_with_default():
    """Test that getting a missing key returns the provided default value."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    assert cache.get("missing_key", "fallback") == "fallback"

# 2. Edge Test Cases

def test_cache_service_set_none_key_and_value():
    """Test setting and getting None as key and value."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set(None, None)
    assert cache.get(None) is None

def test_cache_service_set_empty_string_key_and_value():
    """Test setting and getting empty string as key and value."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("", "")
    assert cache.get("") == ""

def test_cache_service_overwrite_value():
    """Test overwriting a value for a given key."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("key", "value1")
    cache.set("key", "value2")
    assert cache.get("key") == "value2"

def test_cache_service_multiple_types():
    """Test storing and retrieving different types of values."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("int", 123)
    cache.set("float", 3.14)
    cache.set("list", [1, 2, 3])
    cache.set("dict", {"a": 1})
    assert cache.get("int") == 123
    assert cache.get("float") == 3.14
    assert cache.get("list") == [1, 2, 3]
    assert cache.get("dict") == {"a": 1}

def test_cache_service_unicode_and_special_characters():
    """Test that unicode and special characters are handled as keys and values."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("üñîçødë", "välüé✨")
    assert cache.get("üñîçødë") == "välüé✨"

def test_cache_service_large_key_and_value():
    """Test handling of large key and value strings."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    large_key = "k" * 1000
    large_value = "v" * 1000
    cache.set(large_key, large_value)
    assert cache.get(large_key) == large_value

def test_cache_service_key_collision():
    """Test that keys with same value but different types are distinct."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set("1", "str_one")
    cache.set(1, "int_one")
    assert cache.get("1") == "str_one"
    assert cache.get(1) == "int_one"

# 3. Large Scale Test Cases

def test_cache_service_many_keys():
    """Test storing and retrieving a large number of keys."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    num_keys = 500  # Keep under 1000 as per instructions
    for i in range(num_keys):
        cache.set(f"key_{i}", i)
    for i in range(num_keys):
        assert cache.get(f"key_{i}") == i

def test_cache_service_large_values():
    """Test storing and retrieving large values."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    large_value = "x" * 10000  # Large but reasonable string
    cache.set("large", large_value)
    assert cache.get("large") == large_value

def test_cache_service_performance_under_load():
    """Test that cache service can handle rapid set/get operations."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    keys = [f"k{i}" for i in range(500)]
    values = [f"v{i}" for i in range(500)]
    for k, v in zip(keys, values):
        cache.set(k, v)
    for k, v in zip(keys, values):
        assert cache.get(k) == v

def test_cache_service_no_cross_test_leakage():
    """Ensure that cache is reset between tests (test isolation)."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    assert cache is not None

# Additional edge: Test that repeated calls return the same instance
def test_get_cache_service_returns_new_instance_each_time():
    """Test that repeated calls to get_cache_service return the same singleton instance."""
    codeflash_output = get_cache_service(); cache1 = codeflash_output
    codeflash_output = get_cache_service(); cache2 = codeflash_output
    assert cache1 is cache2

# Edge: Test that default factory is used if service not found


def test_cache_service_boolean_keys_and_values():
    """Test storing and retrieving boolean keys and values."""
    codeflash_output = get_cache_service(); cache = codeflash_output
    cache.set(True, False)
    cache.set(False, True)
    assert cache.get(True) is False
    assert cache.get(False) is True
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr10953-2025-12-10T21.56.05 and push.

Codeflash

lucaseduoli and others added 12 commits December 10, 2025 14:40
… to use Union for better clarity and compatibility
codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) label Dec 10, 2025
coderabbitai bot commented Dec 10, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions bot added the community (Pull Request from an external contributor) label Dec 10, 2025
codecov bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.06%. Comparing base (b6ed2bc) to head (014e5bd).
⚠️ Report is 1 commit behind head on release-1.7.0.

Files with missing lines | Patch % | Lines
src/backend/base/langflow/services/deps.py | 83.33% | 3 Missing ⚠️

❌ Your project status has failed because the head coverage (40.02%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files


@@              Coverage Diff               @@
##           release-1.7.0   #10964   +/-   ##
==============================================
  Coverage          33.05%   33.06%           
==============================================
  Files               1368     1368           
  Lines              63815    63822    +7     
  Branches            9391     9391           
==============================================
+ Hits               21093    21100    +7     
  Misses             41679    41679           
  Partials            1043     1043           
Flag | Coverage Δ
backend | 52.79% <84.21%> (+0.01%) ⬆️
lfx | 40.02% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines | Coverage Δ
src/backend/base/langflow/api/utils/core.py | 62.44% <100.00%> (ø)
src/backend/base/langflow/services/deps.py | 83.56% <83.33%> (+0.22%) ⬆️

... and 4 files with indirect coverage changes

Base automatically changed from fix/folders_download to release-1.7.0 December 10, 2025 22:23
@codeflash-ai codeflash-ai bot closed this Dec 10, 2025
codeflash-ai bot commented Dec 10, 2025

This PR has been automatically closed because the original PR #10953 by lucaseduoli was closed.

codeflash-ai bot deleted the codeflash/optimize-pr10953-2025-12-10T21.56.05 branch December 10, 2025 22:23