⚡️ Speed up function expand_compact_flow by 120% in PR #10785 (feat/no-code-pre-built-component)
#10788
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
⚡️ This pull request contains optimizations for PR #10785
If you approve this dependent PR, these changes will be merged into the original PR branch
feat/no-code-pre-built-component.📄 120% (1.20x) speedup for
expand_compact_flowinsrc/backend/base/langflow/processing/expand_flow.py⏱️ Runtime :
21.0 milliseconds→9.54 milliseconds(best of172runs)📝 Explanation and details
The optimized code achieves a 119% speedup through several key performance optimizations that reduce memory allocation and improve algorithmic efficiency:
Primary Performance Gains:
Eliminated expensive
deepcopy()in_expand_node(95.9% of original runtime): Replaced the costly deep copy of entire component templates with selective shallow copying. Only thetemplatedict is copied when needed, and nested dicts are copied only when mutated. This dramatic reduction from 169ms to ~10ms in the profiler represents the largest performance gain.Optimized
_get_flat_components: Replaced dict comprehension with imperativedict.update()calls, reducing intermediate object creation and repeated.items()calls. Runtime dropped from 307μs to 153μs.Improved loop efficiency in
_expand_edge: Replacednext()with generator expression with an explicitforloop andbreak, avoiding generator overhead for small output lists.Used
type(x) is dictinstead ofisinstance(x, dict): Micro-optimization for built-in type checking that's slightly faster when dealing with known built-in types.Replaced loops with comprehensions: Used dict/list comprehensions in
expand_compact_flowfor node and edge expansion, providing better memory locality and reduced function call overhead.Why These Optimizations Work:
deepcopy()was recursively copying entire component templates when only thetemplatefield needed modification. The optimized version reduces memory churn by ~90%.Impact on Test Cases:
The optimizations are particularly effective for the large-scale test cases (
test_large_scale_performancewith 500 nodes,test_large_number_of_nodes_and_edgeswith 200 nodes) where the cumulative effect of avoiding expensive deep copies becomes significant. Basic test cases also benefit proportionally, making this optimization universally beneficial across different flow sizes.The optimization preserves all functionality while dramatically improving performance for flow expansion operations, which appear to be in a performance-critical path based on the comprehensive test coverage.
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
🌀 Generated Regression Tests and Runtime
To edit these changes
git checkout codeflash/optimize-pr10785-2025-11-28T20.34.28and push.