⚡️ Speed up function _create_email_model by 19% in PR #10702 (pluggable-auth-service)
#10771
⚡️ This pull request contains optimizations for PR #10702

If you approve this dependent PR, these changes will be merged into the original PR branch `pluggable-auth-service`.

📄 19% (0.19x) speedup for `_create_email_model` in `src/backend/base/langflow/utils/registered_email_util.py`

⏱️ Runtime: 144 milliseconds → 121 milliseconds (best of 36 runs)

📝 Explanation and details
The optimization achieves a 19% speedup by adding a lightweight pre-validation check that short-circuits expensive `EmailPayload` construction for obviously invalid emails.

Key optimizations:
- Early '@' validation: Added an `if '@' not in email:` check before calling `EmailPayload(email=email)`. Since a valid address must contain an '@', this trivial substring search rejects clearly invalid inputs without running the expensive validation.
- Improved empty string check: Changed `not isinstance(email, str) or (len(email) == 0)` to `not (isinstance(email, str) and email)`, which leverages Python's truthiness evaluation and avoids the `len()` call.
- Eliminated unnecessary variable assignment: Removed the intermediate `email_model` variable; `EmailPayload` is now returned directly when validation passes.

Why this works: The profiler shows `EmailPayload(email=email)` consumes 96.5% of execution time in the original code. By catching 1,503 out of 3,066 invalid emails with the fast '@' check (as shown in the optimized profiler), we avoid expensive validation for ~49% of inputs. The '@' check takes only 0.1% of total time but prevents costly `EmailPayload` construction attempts.

Test case benefits: This optimization particularly helps with:
- Bulk tests (`test_bulk_mixed_emails`, `test_bulk_invalid_emails`)
- Inputs such as "plainaddress", "user@", "@example.com"

The optimization maintains identical behavior and error messages while providing substantial performance gains on workloads with many invalid email inputs.
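The three changes described above can be illustrated with a minimal sketch. This is not the code from the PR: `EmailPayload` here is a hypothetical stand-in with assumed validation rules, the function names `create_email_model_original` and `create_email_model_optimized` are made up for the comparison, and any logging the real function performs is omitted.

```python
from typing import Optional


class EmailPayload:
    """Hypothetical stand-in for the real model; the validation rules are assumed."""

    def __init__(self, email: str) -> None:
        local, sep, domain = email.partition("@")
        # Reject a missing '@', an empty local part, a dotless domain, or a second '@'.
        if not sep or not local or "." not in domain or "@" in domain:
            raise ValueError("Invalid email address")
        self.email = email


def create_email_model_original(email) -> Optional[EmailPayload]:
    # Original shape: every non-empty string reaches the validator.
    if not isinstance(email, str) or (len(email) == 0):
        return None
    try:
        email_model = EmailPayload(email=email)
    except ValueError:
        return None
    return email_model


def create_email_model_optimized(email) -> Optional[EmailPayload]:
    # Truthiness check replaces the explicit len() call.
    if not (isinstance(email, str) and email):
        return None
    # Cheap substring search skips the validator when '@' is absent.
    if "@" not in email:
        return None
    try:
        # Return directly instead of binding an intermediate variable.
        return EmailPayload(email=email)
    except ValueError:
        return None
```

The pre-check only pays off for inputs without any '@'. Inputs that contain one, valid or not, still reach the validator and additionally pay for the substring search.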
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import pytest
from langflow.utils.registered_email_util import _create_email_model


# Dummy EmailPayload class for testing
class EmailPayload:
    def __init__(self, email):
        # Very basic email validation for test purposes
        if not isinstance(email, str) or "@" not in email or email.startswith("@") or email.endswith("@") or ".." in email:
            raise ValueError("Invalid email address")
        if email.count("@") != 1:
            raise ValueError("Invalid email address")
        local, domain = email.split("@")
        if not local or not domain or "." not in domain or domain.startswith(".") or domain.endswith("."):
            raise ValueError("Invalid email address")
        self.email = email


# Dummy logger for testing
class DummyLogger:
    def __init__(self):
        self.errors = []


logger = DummyLogger()

from langflow.utils.registered_email_util import _create_email_model

# unit tests

# --- Basic Test Cases ---
def test_valid_email_simple():
    # Test with a standard valid email
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_dot():
    # Test with a valid email containing dots in the local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_subdomain():
    # Test with a valid email containing a subdomain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_plus():
    # Test with a valid email using a plus sign
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_numbers():
    # Test with a valid email containing numbers
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output
# --- Edge Test Cases ---

@pytest.mark.parametrize("invalid_email", [
    "",  # Empty string
    None,  # NoneType
    12345,  # Integer
    [],  # List
    {},  # Dict
    "plainaddress",  # Missing @
    "user@",  # Missing domain
    "@example.com",  # Missing local part
    "user@@example.com",  # Double @
    "[email protected]",  # Domain starts with dot
    "user@com.",  # Domain ends with dot
    "[email protected]",  # Double dot in domain
    "[email protected]",  # Double dot in local part
    "user@example",  # Domain without dot
    "[email protected]",  # Domain starts with dot
    "[email protected]",  # Domain starts with invalid character
    "[email protected].",  # Domain ends with dot
    "[email protected].",  # Domain starts and ends with dot
    "[email protected].",  # Domain starts and ends with dot
])
def test_invalid_emails_return_none(invalid_email):
    # Test a variety of invalid emails
    codeflash_output = _create_email_model(invalid_email); result = codeflash_output
def test_email_with_spaces():
    # Test with spaces in the email
    email = "user [email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_leading_trailing_spaces():
    # Test with leading/trailing spaces (should be invalid)
    email = " [email protected] "
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_unicode():
    # Test with unicode characters (should be invalid for this validator)
    email = "usé[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_special_chars():
    # Test with special characters not allowed in local part
    email = "user!#$%&'*+/=?^_`{|}[email protected]"
    # Our dummy validator does not support these, so should return None
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_multiple_at_symbols():
    # Test with multiple @ symbols
    email = "user@@example.com"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_only_at():
    # Test with only @ symbol
    email = "@"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_as_boolean():
    # Test with boolean value
    email = True
    codeflash_output = _create_email_model(email); result = codeflash_output
# --- Large Scale Test Cases ---

def test_email_with_long_local_and_domain():
    # Test with long local and domain parts (but still valid)
    local = "a" * 64
    domain = "b" * 63 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_max_length():
    # RFC says max email length is 254 chars
    local = "a" * 64
    domain = "b" * (254 - len(local) - 1)  # -1 for @
    email = f"{local}@{domain}"
    # Our validator will fail this because domain needs a dot
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_numeric_tld():
    # Test with numeric TLD (should be valid for our dummy validator)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_uppercase():
    # Test with uppercase letters
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_mixed_case():
    # Test with mixed case
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_hyphen_in_domain():
    # Test with hyphen in domain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_underscore_in_local():
    # Test with underscore in local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_trailing_newline():
    # Test with trailing newline character
    email = "[email protected]\n"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_tab_character():
    # Test with tab character in email
    email = "user\[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from langflow.utils.registered_email_util import _create_email_model


# Simulate EmailPayload for testing purposes.
# In real code, this would be imported from langflow.services.telemetry.schema
class EmailPayload:
    def __init__(self, email: str):
        # Very basic email validation: must contain '@' and at least one '.' after '@'
        if not isinstance(email, str):
            raise ValueError("Email must be a string.")
        if '@' not in email or email.count('@') != 1:
            raise ValueError("Email must contain a single '@'.")
        local, domain = email.split('@')
        if not local or not domain or '.' not in domain or domain.startswith('.') or domain.endswith('.'):
            raise ValueError("Email domain must contain a '.' and not start/end with '.'.")
        if any(c in email for c in ' ,;'):
            raise ValueError("Email contains invalid characters.")
        self.email = email


# Dummy logger for testing
class DummyLogger:
    def error(self, msg):
        pass  # Ignore logging in tests


logger = DummyLogger()

from langflow.utils.registered_email_util import _create_email_model

# unit tests

# --- Basic Test Cases ---
def test_valid_email_basic():
    # Test with a standard valid email
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_dot_in_local():
    # Test with a dot in the local part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_subdomain():
    # Test with subdomain in domain part
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_valid_email_with_numbers():
    # Test with numbers in local and domain
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output
# --- Edge Test Cases ---

def test_empty_string():
    # Test with empty string
    email = ""
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_none_input():
    # Test with None as input
    email = None
    codeflash_output = _create_email_model(email); result = codeflash_output

@pytest.mark.parametrize("bad_type", [123, 45.6, [], {}, True, False])
def test_non_string_types(bad_type):
    # Test with non-string types
    codeflash_output = _create_email_model(bad_type); result = codeflash_output

@pytest.mark.parametrize("invalid_email", [
    "plainaddress",  # No @
    "user@",  # No domain
    "@example.com",  # No local part
    "[email protected]",  # Domain starts with dot
    "user@com.",  # Domain ends with dot
    "user@com",  # No dot in domain
    "user@@example.com",  # Double @
    "user@ example.com",  # Space in domain
    "user@exam ple.com",  # Space in domain
    "user@exam,ple.com",  # Comma in domain
    "user@exam;ple.com",  # Semicolon in domain
    "user [email protected]",  # Space in local part
    "[email protected]",  # Domain starts with dot
])
def test_invalid_email_formats(invalid_email):
    # Test various invalid email formats
    codeflash_output = _create_email_model(invalid_email); result = codeflash_output
def test_email_with_leading_trailing_spaces():
    # Test email with leading/trailing spaces
    email = " [email protected] "
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_special_characters():
    # Test email with invalid special characters
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_multiple_dots_in_domain():
    # Test email with multiple dots in domain (valid case)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_dot_at_start_of_domain():
    # Test email with dot at start of domain (invalid)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_dot_at_end_of_domain():
    # Test email with dot at end of domain (invalid)
    email = "[email protected]."
    codeflash_output = _create_email_model(email); result = codeflash_output
# --- Large Scale Test Cases ---

def test_bulk_valid_emails():
    # Test with a large list of valid emails
    emails = [f"user{i}@example{i}.com" for i in range(1000)]
    for email in emails:
        codeflash_output = _create_email_model(email); result = codeflash_output

def test_bulk_invalid_emails():
    # Test with a large list of invalid emails (missing '@')
    emails = [f"user{i}example{i}.com" for i in range(1000)]
    for email in emails:
        codeflash_output = _create_email_model(email); result = codeflash_output

def test_bulk_mixed_emails():
    # Test with a mix of valid and invalid emails
    valid_emails = [f"user{i}@example{i}.com" for i in range(500)]
    invalid_emails = [f"user{i}example{i}.com" for i in range(500)]
    all_emails = valid_emails + invalid_emails
    for i, email in enumerate(all_emails):
        codeflash_output = _create_email_model(email); result = codeflash_output
        if i < 500:
            pass
        else:
            pass

def test_long_local_and_domain_parts():
    # Test with very long local and domain parts (but within reasonable length)
    local = "a" * 100
    domain = "b" * 100 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_long_email_with_invalid_format():
    # Test with long email but invalid format (missing '@')
    email = "a" * 200 + ".com"
    codeflash_output = _create_email_model(email); result = codeflash_output
# --- Edge: Unicode and Internationalized Emails ---

def test_email_with_unicode_characters():
    # Unicode in local part (allowed in some standards, but our validation rejects)
    email = "usé[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

def test_email_with_unicode_domain():
    # Unicode in domain part (should be rejected by our validation)
    email = "user@exámple.com"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Minimal Valid Email ---

def test_minimal_valid_email():
    # Minimal valid email (single char local, single char domain, with dot)
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Maximal Valid Email ---

def test_maximal_valid_email():
    # Maximal valid email (within reasonable length)
    local = "a" * 64
    domain = "b" * 63 + ".com"
    email = f"{local}@{domain}"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with consecutive dots in domain (invalid) ---

def test_email_with_consecutive_dots_in_domain():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with consecutive dots in local (invalid in an unquoted RFC 5322 local part, but our validation allows) ---

def test_email_with_consecutive_dots_in_local():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with hyphen in domain (valid) ---

def test_email_with_hyphen_in_domain():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# --- Edge: Email with underscore in local (valid) ---

def test_email_with_underscore_in_local():
    email = "[email protected]"
    codeflash_output = _create_email_model(email); result = codeflash_output

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
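The idea of comparing the original and optimized outputs can be sketched as a small harness. How Codeflash actually performs this comparison is not shown here, so everything below is an assumption for illustration: `outcome`, `find_mismatches`, `slow_validate`, and `fast_validate` are hypothetical names, and the two validators are simplified stand-ins that return the email string or `None`.

```python
def outcome(func, arg):
    """Summarize a call as ('returned', value) or ('raised', exception type name)."""
    try:
        return ("returned", func(arg))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def find_mismatches(original, optimized, inputs):
    """Return the inputs for which the two implementations behave differently."""
    return [value for value in inputs if outcome(original, value) != outcome(optimized, value)]


def slow_validate(email):
    # Stand-in for the original path: every non-empty string is fully checked.
    if not isinstance(email, str) or len(email) == 0:
        return None
    local, sep, domain = email.partition("@")
    if not sep or not local or "." not in domain:
        return None
    return email


def fast_validate(email):
    # Stand-in for the optimized path: cheap checks first, then the full check.
    if not (isinstance(email, str) and email):
        return None
    if "@" not in email:
        return None
    return slow_validate(email)


SAMPLE_INPUTS = ["", None, 12345, [], {}, "plainaddress", "user@", "@example.com", "a@b.co"]
```

An empty result from `find_mismatches(slow_validate, fast_validate, SAMPLE_INPUTS)` means the two versions agree on every sampled input. It says nothing about inputs outside the sample.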
To edit these changes, run `git checkout codeflash/optimize-pr10702-2025-11-28T02.15.55` and push.