⚡️ Speed up function `_serialize_dispatcher` by 22% in PR #6028 (`PlaygroundPage`) #6165

codeflash-ai · 2025-02-06T17:29:59Z

⚡️ This pull request contains optimizations for PR #6028

If you approve this dependent PR, these changes will be merged into the original PR branch PlaygroundPage.

This PR will be automatically closed if the original PR is merged.

📄 22% (0.22x) speedup for `_serialize_dispatcher` in `src/backend/base/langflow/serialization/serialization.py`

⏱️ Runtime: 1.47 milliseconds → 1.20 milliseconds (best of 93 runs)
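The 22% figure is consistent with measuring the saved time relative to the optimized runtime. The PR does not state the formula, so the check below is a sketch under that assumption, and `speedup` is a hypothetical helper name.

```python
def speedup(original_ms: float, optimized_ms: float) -> float:
    """Relative speedup, assuming it is (original - optimized) / optimized."""
    return (original_ms - optimized_ms) / optimized_ms


# Runtimes reported in this PR, in milliseconds.
ratio = speedup(1.47, 1.20)
print(f"{ratio:.3f}x, about {int(ratio * 100)}%")
```

Under the alternative convention (saved time divided by the original runtime) the same numbers would give roughly 18%, which does not match the headline.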

📝 Explanation and details

To improve both the runtime performance and memory usage of the provided serialization code, the following optimizations can be applied:

1. Avoid redundant checks: Remove redundant checks within dispatch functions by organizing and minimizing condition checks.
2. Optimize `dict` and `list` handling: Precompute attributes used multiple times and use more efficient iterations.
3. Use efficient logging: Replace any runtime debugging logs with appropriate error handling mechanisms.
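As an illustration of the first two points, a type dispatcher can test each type family once, in a fixed order, and recurse into containers with comprehensions. The function below is a minimal sketch and not langflow's `_serialize_dispatcher`: the name `dispatch_sketch`, the fallback to `str`, and the set of handled types are assumptions made for the example.

```python
from datetime import datetime
from decimal import Decimal
from typing import Any
from uuid import UUID


def dispatch_sketch(obj: Any) -> Any:
    """Hypothetical dispatcher: one check per type family, cheapest first."""
    # Primitives pass through unchanged (bool is covered by the int check).
    if obj is None or isinstance(obj, (int, float, complex)):
        return obj
    if isinstance(obj, str):
        return obj
    if isinstance(obj, bytes):
        return obj.decode("utf-8", errors="ignore")
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return float(obj)
    if isinstance(obj, UUID):
        return str(obj)
    if isinstance(obj, dict):
        # Recurse into values with a single comprehension.
        return {key: dispatch_sketch(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [dispatch_sketch(item) for item in obj]
    # Assumed fallback for anything else: its string form.
    return str(obj)
```

Ordering the checks so that the most common inputs are matched first keeps the number of `isinstance` calls per object low, which is the kind of saving the description attributes the speedup to.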

Here's the optimized code.

Key changes:

1. Reduced redundant checks in the `_serialize_dispatcher`.
2. Simplified string and bytes serialization functions to avoid unnecessary recalculations.
3. Used more efficient comprehension in `_serialize_list_tuple`.
4. Kept detailed logging where necessary but avoided needless logs in production-critical paths.
5. Removed unnecessary conditions inside `_serialize_dispatcher`.
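The string and list changes can be pictured with two small helpers that compute the length once and slice only when truncation is needed. This is a sketch, not the code from the PR: `truncate_text` and `truncate_items` are hypothetical names, and the output formats (`"..."` suffix, `"... [truncated N items]"` marker) are taken from the expected values in the generated tests rather than from the langflow source.

```python
from typing import Any, List, Optional, Sequence


def truncate_text(text: str, max_length: Optional[int]) -> str:
    """Return text unchanged, or its first max_length characters plus '...'."""
    if max_length is None or len(text) <= max_length:
        return text
    return text[:max_length] + "..."


def truncate_items(items: Sequence[Any], max_items: Optional[int]) -> List[Any]:
    """Return items as a list, appending a marker if some were dropped."""
    count = len(items)  # computed once and reused below
    if max_items is None or count <= max_items:
        return list(items)
    return [*items[:max_items], f"... [truncated {count - max_items} items]"]


print(truncate_text("a" * 1000, 50))
print(truncate_items(list(range(1000)), 10))
```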

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 48 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	undefined

🌀 Generated Regression Tests Details

from datetime import datetime
from decimal import Decimal
from typing import Any
from uuid import UUID

import numpy as np
import pandas as pd
# imports
import pytest  # used for our unit tests
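from langflow.serialization.constants import MAX_TEXT_LENGTH  # used by the str and bytes tests below
from langflow.serialization.serialization import _serialize_dispatcher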
from pydantic import BaseModel


# function to test
class _UnserializableSentinel:
    def __repr__(self):
        return "[Unserializable Object]"

UNSERIALIZABLE_SENTINEL = _UnserializableSentinel()

# unit tests

# Test cases for primitive types
def test_serialize_none():
    codeflash_output = _serialize_dispatcher(None, None, None)

def test_serialize_int():
    codeflash_output = _serialize_dispatcher(42, None, None)

def test_serialize_float():
    codeflash_output = _serialize_dispatcher(3.14, None, None)

def test_serialize_bool():
    codeflash_output = _serialize_dispatcher(True, None, None)
    codeflash_output = _serialize_dispatcher(False, None, None)

def test_serialize_complex():
    codeflash_output = _serialize_dispatcher(1 + 2j, None, None)

# Test cases for strings
def test_serialize_str():
    codeflash_output = _serialize_dispatcher("Hello, World!", None, None)
    codeflash_output = _serialize_dispatcher("", None, None)
    long_str = "a" * (MAX_TEXT_LENGTH + 10)
    codeflash_output = _serialize_dispatcher(long_str, MAX_TEXT_LENGTH, None)

# Test cases for bytes
def test_serialize_bytes():
    codeflash_output = _serialize_dispatcher(b"Hello, World!", None, None)
    codeflash_output = _serialize_dispatcher(b"", None, None)
    long_bytes = b"a" * (MAX_TEXT_LENGTH + 10)
    codeflash_output = _serialize_dispatcher(long_bytes, MAX_TEXT_LENGTH, None)

# Test cases for datetime
def test_serialize_datetime():
    dt = datetime(2020, 1, 1, 12, 0, 0)
    codeflash_output = _serialize_dispatcher(dt, None, None)

# Test cases for decimal
def test_serialize_decimal():
    dec = Decimal("3.14")
    codeflash_output = _serialize_dispatcher(dec, None, None)

# Test cases for UUID
def test_serialize_uuid():
    uuid = UUID("12345678123456781234567812345678")
    codeflash_output = _serialize_dispatcher(uuid, None, None)

# Test cases for dictionaries
def test_serialize_dict():
    d = {"key": "value"}
    codeflash_output = _serialize_dispatcher(d, None, None)
    nested_dict = {"key": {"subkey": "subvalue"}}
    codeflash_output = _serialize_dispatcher(nested_dict, None, None)

# Test cases for lists and tuples
def test_serialize_list_tuple():
    lst = [1, 2, 3]
    codeflash_output = _serialize_dispatcher(lst, None, None)
    tpl = (1, 2, 3)
    codeflash_output = _serialize_dispatcher(tpl, None, None)
    nested_list = [[1, 2], [3, 4]]
    codeflash_output = _serialize_dispatcher(nested_list, None, None)

# Test cases for pandas DataFrame
def test_serialize_dataframe():
    df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
    codeflash_output = _serialize_dispatcher(df, None, None)

# Test cases for pandas Series
def test_serialize_series():
    series = pd.Series([1, 2, 3])
    codeflash_output = _serialize_dispatcher(series, None, None)

# Test cases for numpy types




from collections.abc import AsyncIterator, Generator, Iterator
from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, cast
from uuid import UUID

# function to test
import numpy as np
import pandas as pd
# imports
import pytest  # used for our unit tests
from langchain_core.documents import Document
from langflow.serialization.constants import MAX_ITEMS_LENGTH, MAX_TEXT_LENGTH
from langflow.serialization.serialization import _serialize_dispatcher
from loguru import logger
from pydantic import BaseModel
from pydantic.v1 import BaseModel as BaseModelV1


class _UnserializableSentinel:
    def __repr__(self):
        return "[Unserializable Object]"

UNSERIALIZABLE_SENTINEL = _UnserializableSentinel()

# unit tests

@pytest.mark.parametrize("input_obj,expected_output", [
    (None, None),  # Test None
    (42, 42),  # Test integer
    (3.14, 3.14),  # Test float
    (True, True),  # Test boolean
    (1 + 2j, 1 + 2j),  # Test complex number
    ("Hello, World!", "Hello, World!"),  # Test string
    ("a" * 40, "a" * 40),  # Test string shorter than max_length (no truncation)
    ("a" * 1000, "a" * 50 + "..."),  # Test long string with truncation
    (b"Hello, World!", "Hello, World!"),  # Test bytes
    (datetime(2023, 10, 1, 12, 0, 0), "2023-10-01T12:00:00+00:00"),  # Test datetime
    (Decimal("10.5"), 10.5),  # Test Decimal
    (UUID("12345678123456781234567812345678"), "12345678-1234-5678-1234-567812345678"),  # Test UUID
    ([1, 2, 3], [1, 2, 3]),  # Test list
    ((1, 2, 3), [1, 2, 3]),  # Test tuple
    ({"key": "value"}, {"key": "value"}),  # Test dictionary
    (np.array([1, 2, 3]), [1, 2, 3]),  # Test numpy array
    (np.int32(42), 42),  # Test numpy scalar
])
def test_serialize_dispatcher(input_obj, expected_output):
    codeflash_output = _serialize_dispatcher(input_obj, max_length=50, max_items=10)

# Custom classes for testing
class CustomClass:
    def __str__(self):
        return "CustomClass"

class PydanticModel(BaseModel):
    field: str

class PydanticModelV1(BaseModelV1):
    field: str

# Additional tests for custom classes, Pydantic models, and edge cases
@pytest.mark.parametrize("input_obj,expected_output", [
    (CustomClass(), "CustomClass"),  # Test custom class instance
    (PydanticModel(field="value"), {"field": "value"}),  # Test Pydantic model
    (PydanticModelV1(field="value"), {"field": "value"}),  # Test Pydantic v1 model
    ([], []),  # Test empty list
    ((), []),  # Test empty tuple
    ({}, {}),  # Test empty dictionary
    ([1, "two", 3.0, None], [1, "two", 3.0, None]),  # Test list with mixed types
    ({"key1": 1, "key2": "two", "key3": 3.0}, {"key1": 1, "key2": "two", "key3": 3.0}),  # Test dictionary with mixed types
])
def test_serialize_dispatcher_additional(input_obj, expected_output):
    codeflash_output = _serialize_dispatcher(input_obj, max_length=50, max_items=10)

# Test large scale inputs
def test_serialize_dispatcher_large_scale():
    large_list = list(range(1000))
    expected_output = list(range(10)) + ["... [truncated 990 items]"]
    codeflash_output = _serialize_dispatcher(large_list, max_length=50, max_items=10)

    large_dict = {f"key{i}": i for i in range(1000)}
    expected_output = {f"key{i}": i for i in range(10)}
    codeflash_output = _serialize_dispatcher(large_dict, max_length=50, max_items=10)

    large_df = pd.DataFrame({"col1": range(1000), "col2": range(1000)})
    expected_output = [{"col1": i, "col2": i} for i in range(10)]
    codeflash_output = _serialize_dispatcher(large_df, max_length=50, max_items=10)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
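The comparison described in that comment can be pictured as running both versions on the same inputs and requiring equal results. The sketch below is not Codeflash's actual harness: `outputs_match`, `slow_truncate`, and `fast_truncate` are hypothetical names used only to show the idea.

```python
from typing import Any, Callable, Iterable, Tuple


def outputs_match(
    original: Callable[..., Any],
    optimized: Callable[..., Any],
    cases: Iterable[Tuple[Any, ...]],
) -> bool:
    """Return True if both callables agree on every argument tuple."""
    for args in cases:
        if original(*args) != optimized(*args):
            return False
    return True


def slow_truncate(text: str, max_length: int) -> str:
    # Character-by-character baseline standing in for the original code.
    out = ""
    for index, char in enumerate(text):
        if index >= max_length:
            return out + "..."
        out += char
    return out


def fast_truncate(text: str, max_length: int) -> str:
    # Slice-based version standing in for the optimized code.
    return text if len(text) <= max_length else text[:max_length] + "..."


cases = [("Hello, World!", 50), ("a" * 1000, 50), ("", 10), ("abc", 3)]
print(outputs_match(slow_truncate, fast_truncate, cases))
```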


codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Feb 6, 2025

codeflash-ai bot mentioned this pull request Feb 6, 2025

feat: configure and update PlaygroundPage #6028

Draft

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Feb 6, 2025
