@codeflash-ai codeflash-ai bot commented Nov 28, 2025

⚡️ This pull request contains optimizations for PR #10702

If you approve this dependent PR, these changes will be merged into the original PR branch pluggable-auth-service.

This PR will be automatically closed if the original PR is merged.


📄 19% (0.19x) speedup for _create_email_model in src/backend/base/langflow/utils/registered_email_util.py

⏱️ Runtime : 144 milliseconds → 121 milliseconds (best of 36 runs)

📝 Explanation and details

The optimization achieves a 19% speedup by adding a lightweight pre-validation check that short-circuits expensive EmailPayload construction for obviously invalid emails.

Key optimizations:

  1. Early '@' validation: Added an if '@' not in email: check before calling EmailPayload(email=email). Since a valid email must contain an '@' symbol, this inexpensive substring search skips the expensive validation for inputs that cannot pass it.

  2. Improved empty string check: Changed not isinstance(email, str) or (len(email) == 0) to not (isinstance(email, str) and email), which relies on Python's truthiness evaluation and avoids the len() call.

  3. Eliminated unnecessary variable assignment: Removed the intermediate email_model variable; the EmailPayload is now returned directly when validation passes.
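The three changes above can be sketched side by side. This is only an illustration reconstructed from the description: the real function body is not shown in this comment, so the `EmailPayload` stand-in, the function names, and the log messages below are all assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class EmailPayload:
    """Hypothetical stand-in for the real payload model (assumption)."""

    def __init__(self, email: str) -> None:
        local, _, domain = email.partition("@")
        # Deliberately simplistic validation, for illustration only.
        if not local or "." not in domain or "@" in domain:
            raise ValueError("Invalid email address")
        self.email = email


def create_email_model_original(email):
    """Shape of the check before the change, as described above."""
    if not isinstance(email, str) or (len(email) == 0):
        logger.error("Email must be a non-empty string")  # assumed message
        return None
    try:
        email_model = EmailPayload(email=email)
    except ValueError as err:
        logger.error("Email is invalid: %s", err)  # assumed message
        return None
    return email_model


def create_email_model_optimized(email):
    """Same contract, with the truthiness check and the early '@' test."""
    if not (isinstance(email, str) and email):
        logger.error("Email must be a non-empty string")  # assumed message
        return None
    if "@" not in email:
        # Fast path: skip EmailPayload construction entirely.
        logger.error("Email is invalid: missing '@'")  # assumed message
        return None
    try:
        return EmailPayload(email=email)
    except ValueError as err:
        logger.error("Email is invalid: %s", err)  # assumed message
        return None
```

Both sketches return `None` for rejected input and an `EmailPayload` otherwise, so a caller cannot tell them apart by return value; only the amount of work done for inputs without an '@' differs.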

Why this works: The profiler shows EmailPayload(email=email) consumes 96.5% of execution time in the original code. By catching 1,503 out of 3,066 invalid emails with the fast '@' check (as shown in the optimized profiler), we avoid expensive validation for ~49% of inputs. The '@' check takes only 0.1% of total time but prevents costly EmailPayload construction attempts.
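As a quick sanity check, the headline numbers quoted in this comment are consistent with each other. The snippet below only redoes the arithmetic; it assumes the speedup is defined as original runtime divided by optimized runtime, minus one.

```python
# Figures quoted in this comment.
original_ms = 144
optimized_ms = 121
short_circuited = 1503
profiled_total = 3066

# Assumed definition: relative speedup = original / optimized - 1.
speedup = original_ms / optimized_ms - 1
fast_path_share = short_circuited / profiled_total

print(f"speedup: {speedup:.1%}")
print(f"share caught by the '@' check: {fast_path_share:.1%}")
```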

Test case benefits: This optimization particularly helps with:

  • Bulk processing of mixed valid/invalid emails (test_bulk_mixed_emails, test_bulk_invalid_emails)
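  • Obviously malformed inputs that contain no '@', such as "plainaddress" (inputs like "user@" or "@example.com" still contain an '@' and therefore still reach EmailPayload validation)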
  • Non-string inputs and edge cases that fail the initial type check
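To make it concrete which inputs are rejected at which stage, here is a small sketch. The helper name `rejection_stage` and the stage labels are hypothetical; the helper only mirrors the order of the checks described above and is not part of the changed file.

```python
def rejection_stage(value: object) -> str:
    """Return the earliest stage at which `value` could be rejected."""
    if not (isinstance(value, str) and value):
        # Non-strings and empty strings never reach the '@' check.
        return "type-or-empty check"
    if "@" not in value:
        # The new fast path added by this change.
        return "'@' fast path"
    # Everything else is handed to the full validator.
    return "EmailPayload validation"


for sample in [None, "", 12345, "plainaddress", "user@", "@example.com"]:
    print(repr(sample), "->", rejection_stage(sample))
```

Under these assumptions, only strings without any '@' benefit from the new fast path; non-string and empty inputs were already rejected before construction.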

The optimization maintains identical behavior and error messages while providing substantial performance gains on workloads with many invalid email inputs.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 3080 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

import pytest
from langflow.utils.registered_email_util import _create_email_model

# Dummy EmailPayload class for testing

class EmailPayload:
    def __init__(self, email):
        # Very basic email validation for test purposes
        if not isinstance(email, str) or "@" not in email or email.startswith("@") or email.endswith("@") or ".." in email:
            raise ValueError("Invalid email address")
        if email.count("@") != 1:
            raise ValueError("Invalid email address")
        local, domain = email.split("@")
        if not local or not domain or "." not in domain or domain.startswith(".") or domain.endswith("."):
            raise ValueError("Invalid email address")
        self.email = email

    def __eq__(self, other):
        # For testing equality in asserts
        return isinstance(other, EmailPayload) and self.email == other.email

    def __repr__(self):
        return f"EmailPayload(email={self.email!r})"

# Dummy logger for testing

class DummyLogger:
    def __init__(self):
        self.errors = []

    def error(self, msg):
        self.errors.append(msg)

logger = DummyLogger()
from langflow.utils.registered_email_util import _create_email_model

# unit tests

# --- Basic Test Cases ---

def test_valid_email_simple():
    # Test with a standard valid email
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_dot():
    # Test with a valid email containing dots in the local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_subdomain():
    # Test with a valid email containing a subdomain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_plus():
    # Test with a valid email using a plus sign
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_numbers():
    # Test with a valid email containing numbers
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge Test Cases ---

@pytest.mark.parametrize("invalid_email", [
    "",  # Empty string
    None,  # NoneType
    12345,  # Integer
    [],  # List
    {},  # Dict
    "plainaddress",  # Missing @
    "user@",  # Missing domain
    "@example.com",  # Missing local part
    "user@@example.com",  # Double @
    "[email protected]",  # Domain starts with dot
    "user@com.",  # Domain ends with dot
    "[email protected]",  # Double dot in domain
    "[email protected]",  # Double dot in local part
    "user@example",  # Domain without dot
    "[email protected]",  # Domain starts with dot
    "[email protected]",  # Domain starts with invalid character
    "[email protected].",  # Domain ends with dot
    "[email protected].",  # Domain starts and ends with dot
    "[email protected].",  # Domain starts and ends with dot
])
def test_invalid_emails_return_none(invalid_email):
    # Test a variety of invalid emails
    codeflash_output = _create_email_model(invalid_email); result = codeflash_output

def test_email_with_spaces():
    # Test with spaces in the email
    email = "user [email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_leading_trailing_spaces():
    # Test with leading/trailing spaces (should be invalid)
    email = " [email protected] "
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_unicode():
    # Test with unicode characters (should be invalid for this validator)
    email = "usé[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_special_chars():
    # Test with special characters not allowed in local part
    email = "user!#$%&'*+/=?^_`{|}[email protected]"
    # Our dummy validator does not support these, so should return None
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_multiple_at_symbols():
    # Test with multiple @ symbols
    email = "user@@example.com"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_only_at():
    # Test with only @ symbol
    email = "@"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_as_boolean():
    # Test with boolean value
    email = True
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Large Scale Test Cases ---

def test_email_with_long_local_and_domain():
    # Test with long local and domain parts (but still valid)
    local = "a" * 64
    domain = "b" * 63 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_max_length():
    # RFC says max email length is 254 chars
    local = "a" * 64
    domain = "b" * (254 - len(local) - 1)  # -1 for @
    email = f"{local}@{domain}"
    # Our validator will fail this because domain needs a dot
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_numeric_tld():
    # Test with numeric TLD (should be valid for our dummy validator)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_uppercase():
    # Test with uppercase letters
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_mixed_case():
    # Test with mixed case
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_hyphen_in_domain():
    # Test with hyphen in domain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_underscore_in_local():
    # Test with underscore in local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_trailing_newline():
    # Test with trailing newline character
    email = "[email protected]\n"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_tab_character():
    # Test with tab character in email
    email = "user\[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

#------------------------------------------------
import pytest # used for our unit tests
from langflow.utils.registered_email_util import _create_email_model

# Simulate EmailPayload for testing purposes.
# In real code, this would be imported from langflow.services.telemetry.schema

class EmailPayload:
    def __init__(self, email: str):
        # Very basic email validation: must contain '@' and at least one '.' after '@'
        if not isinstance(email, str):
            raise ValueError("Email must be a string.")
        if '@' not in email or email.count('@') != 1:
            raise ValueError("Email must contain a single '@'.")
        local, domain = email.split('@')
        if not local or not domain or '.' not in domain or domain.startswith('.') or domain.endswith('.'):
            raise ValueError("Email domain must contain a '.' and not start/end with '.'.")
        if any(c in email for c in ' ,;'):
            raise ValueError("Email contains invalid characters.")
        self.email = email

    def __eq__(self, other):
        # For test assertions
        return isinstance(other, EmailPayload) and self.email == other.email

    def __repr__(self):
        return f"EmailPayload(email={self.email!r})"

# Dummy logger for testing

class DummyLogger:
    def error(self, msg):
        pass  # Ignore logging in tests

logger = DummyLogger()
from langflow.utils.registered_email_util import _create_email_model

# unit tests

# --- Basic Test Cases ---

def test_valid_email_basic():
    # Test with a standard valid email
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_dot_in_local():
    # Test with a dot in the local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_subdomain():
    # Test with subdomain in domain part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_numbers():
    # Test with numbers in local and domain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge Test Cases ---

def test_empty_string():
    # Test with empty string
    email = ""
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_none_input():
    # Test with None as input
    email = None
    codeflash_output = _create_email_model(email); result = codeflash_output

@pytest.mark.parametrize("bad_type", [123, 45.6, [], {}, True, False])
def test_non_string_types(bad_type):
    # Test with non-string types
    codeflash_output = _create_email_model(bad_type); result = codeflash_output

@pytest.mark.parametrize("invalid_email", [
    "plainaddress",  # No @
    "user@",  # No domain
    "@example.com",  # No local part
    "[email protected]",  # Domain starts with dot
    "user@com.",  # Domain ends with dot
    "user@com",  # No dot in domain
    "user@@example.com",  # Double @
    "user@ example.com",  # Space in domain
    "user@exam ple.com",  # Space in domain
    "user@exam,ple.com",  # Comma in domain
    "user@exam;ple.com",  # Semicolon in domain
    "user [email protected]",  # Space in local part
    "[email protected]",  # Domain starts with dot
])
def test_invalid_email_formats(invalid_email):
    # Test various invalid email formats
    codeflash_output = _create_email_model(invalid_email); result = codeflash_output

def test_email_with_leading_trailing_spaces():
    # Test email with leading/trailing spaces
    email = " [email protected] "
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_special_characters():
    # Test email with invalid special characters
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_multiple_dots_in_domain():
    # Test email with multiple dots in domain (valid case)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_dot_at_start_of_domain():
    # Test email with dot at start of domain (invalid)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_dot_at_end_of_domain():
    # Test email with dot at end of domain (invalid)
    email = "[email protected]."
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Large Scale Test Cases ---

def test_bulk_valid_emails():
    # Test with a large list of valid emails
    emails = [f"user{i}@example{i}.com" for i in range(1000)]
    for email in emails:
        codeflash_output = _create_email_model(email); result = codeflash_output

def test_bulk_invalid_emails():
    # Test with a large list of invalid emails (missing '@')
    emails = [f"user{i}example{i}.com" for i in range(1000)]
    for email in emails:
        codeflash_output = _create_email_model(email); result = codeflash_output

def test_bulk_mixed_emails():
    # Test with a mix of valid and invalid emails
    valid_emails = [f"user{i}@example{i}.com" for i in range(500)]
    invalid_emails = [f"user{i}example{i}.com" for i in range(500)]
    all_emails = valid_emails + invalid_emails
    for i, email in enumerate(all_emails):
        codeflash_output = _create_email_model(email); result = codeflash_output
        if i < 500:
            pass
        else:
            pass

def test_long_local_and_domain_parts():
    # Test with very long local and domain parts (but within reasonable length)
    local = "a" * 100
    domain = "b" * 100 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_long_email_with_invalid_format():
    # Test with long email but invalid format (missing '@')
    email = "a" * 200 + ".com"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Unicode and Internationalized Emails ---

def test_email_with_unicode_characters():
    # Unicode in local part (allowed in some standards, but our validation rejects)
    email = "usé[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_unicode_domain():
    # Unicode in domain part (should be rejected by our validation)
    email = "user@exámple.com"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Minimal Valid Email ---

def test_minimal_valid_email():
    # Minimal valid email (single char local, single char domain, with dot)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Maximal Valid Email ---

def test_maximal_valid_email():
    # Maximal valid email (within reasonable length)
    local = "a" * 64
    domain = "b" * 63 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with consecutive dots in domain (invalid) ---

def test_email_with_consecutive_dots_in_domain():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with consecutive dots in local (invalid for an unquoted local part per RFC 5322, but our dummy validation allows it) ---

def test_email_with_consecutive_dots_in_local():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with hyphen in domain (valid) ---

def test_email_with_hyphen_in_domain():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with underscore in local (valid) ---

def test_email_with_underscore_in_local():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr10702-2025-11-28T02.15.55 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Nov 28, 2025
@github-actions github-actions bot added the community Pull Request from an external contributor label Nov 28, 2025

coderabbitai bot commented Nov 28, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Frontend Unit Test Coverage Report

Coverage Summary

| Lines | Statements | Branches | Functions |
| --- | --- | --- | --- |
| Coverage: 15% | 15.24% (4188/27479) | 8.46% (1778/20993) | 9.57% (579/6049) |

Unit Test Results

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 1638 | 0 💤 | 0 ❌ | 0 🔥 | 20.167s ⏱️ |

codecov bot commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (pluggable-auth-service@3418a59). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ckend/base/langflow/utils/registered_email_util.py | 60.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@                    Coverage Diff                    @@
##             pluggable-auth-service   #10771   +/-   ##
=========================================================
  Coverage                          ?   31.55%           
=========================================================
  Files                             ?     1369           
  Lines                             ?    63526           
  Branches                          ?     9373           
=========================================================
  Hits                              ?    20048           
  Misses                            ?    42446           
  Partials                          ?     1032           

| Flag | Coverage Δ |
| --- | --- |
| backend | 47.92% <60.00%> (?) |
| frontend | 14.08% <ø> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...ckend/base/langflow/utils/registered_email_util.py | 84.90% <60.00%> (ø) |
