Skip to content

Conversation

2SSK
Copy link

@2SSK 2SSK commented Sep 17, 2025

…ctions

Remove complex session reuse logic, cleanup tasks, and reference counting from MCPSessionManager. Move
create_input_schema_from_json_schema function into util.py

Summary by CodeRabbit

  • New Features

    • Dynamic tool input generation from JSON Schema for more accurate, descriptive forms.
    • More reliable sessions with per-context tracking and enhanced health checks.
    • Flow lookup by sanitized name for easier discovery.
  • Refactor

    • Simplified URL validation and redirect checks; reduced strict header requirements for broader compatibility.
    • Streamlined logging to standard levels with targeted debug messages for connectivity and health.
  • Style

    • Consistent logging style across session and connectivity operations.

…ctions

Remove complex session reuse logic, cleanup tasks, and reference
counting from MCPSessionManager. Move
create_input_schema_from_json_schema function into util.py
Copy link
Contributor

coderabbitai bot commented Sep 17, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Refactors MCP utilities and session/client APIs within src/lfx/src/lfx/base/mcp/util.py: introduces dynamic JSON-schema→Pydantic model creation, adds Flow lookup by sanitized name, converts session management to async per-context sessions with health checks and cleanup, simplifies header handling, and updates client URL validation/redirect pre-check signatures.

Changes

Cohort / File(s) Summary
MCP utilities and session/client APIs
src/lfx/src/lfx/base/mcp/util.py
Refactor header processing; add NULLABLE_TYPE_LENGTH; add create_input_schema_from_json_schema; update get_flow_snake_case signature; convert MCPSessionManager.get_session and creation/cleanup helpers to async per-context; replace _cleanup_session_by_id with _cleanup_session; track sessions per context with server-switch detection and health checks; update MCPStdioClient/MCPSseClient validate_url and pre_check_redirect signatures; adjust logging calls; integrate schema-driven tool arg models in update_tools.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant Manager as MCPSessionManager
  participant Client as MCP Client (Stdio/SSE)
  participant Server

  Caller->>Manager: get_session(context_id, params, transport_type)
  alt existing session for context_id
    Manager->>Manager: check health (task state, write-stream, connectivity)
    alt unhealthy or server switched
      Manager->>Manager: _cleanup_session(context_id)
      Manager->>Client: create (async) via _create_stdio/_create_sse
      Client->>Server: connect
      Server-->>Client: connected/ready
      Client-->>Manager: session handle
    else healthy
      Manager-->>Caller: existing session handle
    end
  else no session
    Manager->>Client: create (async) via _create_stdio/_create_sse
    Client->>Server: connect
    Server-->>Client: connected/ready
    Client-->>Manager: session handle
    Manager-->>Caller: new session handle
  end
Loading
sequenceDiagram
  autonumber
  participant Tooling as Tooling (update_tools)
  participant Util as util.create_input_schema_from_json_schema
  Note over Tooling,Util: Build Pydantic model from tool inputSchema
  Tooling->>Util: schema dict ($defs, $ref, anyOf, defaults)
  Util-->>Tooling: Pydantic BaseModel subclass
  Tooling->>Tooling: assemble StructuredTool with arg schema
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

refactor, size:XL, lgtm

Suggested reviewers

  • lucaseduoli
  • mfortman11
  • ogabrielluiz
  • phact

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title concisely and accurately summarizes the primary change—simplifying MCP session management and consolidating utility functions—which matches the changeset that refactors session handling and introduces/moves utility functions; it is clear, specific, and appropriate for history scanning.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added refactor Maintenance tasks and housekeeping and removed refactor Maintenance tasks and housekeeping labels Sep 17, 2025
@github-actions github-actions bot added refactor Maintenance tasks and housekeeping and removed refactor Maintenance tasks and housekeeping labels Sep 17, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/lfx/src/lfx/base/mcp/util.py (2)

838-839: Fix invalid isinstance() unions (runtime TypeError).

Using PEP 604 unions inside isinstance causes TypeError at runtime. Replace with tuples.

-                is_timeout_error = isinstance(e, asyncio.TimeoutError | TimeoutError)
+                is_timeout_error = isinstance(e, (asyncio.TimeoutError, TimeoutError))

-                if (
-                    isinstance(e, ConnectionError | TimeoutError | OSError | ValueError)
+                if (
+                    isinstance(e, (ConnectionError, TimeoutError, OSError, ValueError))
                     or is_closed_resource_error
                     or is_mcp_connection_error
                     or is_timeout_error
                 ):

Also applies to: 869-873


1125-1125: Fix invalid isinstance() unions in SSE client too.

Same issue as above; switch to tuples.

-                is_timeout_error = isinstance(e, asyncio.TimeoutError | TimeoutError)
+                is_timeout_error = isinstance(e, (asyncio.TimeoutError, TimeoutError))

-                if (
-                    isinstance(e, ConnectionError | TimeoutError | OSError | ValueError)
+                if (
+                    isinstance(e, (ConnectionError, TimeoutError, OSError, ValueError))
                     or is_closed_resource_error
                     or is_mcp_connection_error
                     or is_timeout_error
                 ):

Also applies to: 1156-1160

🧹 Nitpick comments (9)
src/lfx/src/lfx/base/mcp/util.py (9)

206-217: Guard against cyclic $ref chains.

resolve_ref can loop indefinitely on cyclic refs. Track seen refs and break on cycles.

-    def resolve_ref(s: dict[str, Any] | None) -> dict[str, Any]:
+    def resolve_ref(s: dict[str, Any] | None) -> dict[str, Any]:
         """Follow a $ref chain until you land on a real subschema."""
         if s is None:
             return {}
-        while "$ref" in s:
-            ref_name = s["$ref"].split("/")[-1]
-            s = defs.get(ref_name)
+        seen: set[str] = set()
+        while "$ref" in s:
+            ref_name = s["$ref"].split("/")[-1]
+            if ref_name in seen:
+                logger.warning(f"Cyclic $ref detected at '{ref_name}'")
+                return {"type": "string"}
+            seen.add(ref_name)
+            s = defs.get(ref_name)
             if s is None:
                 logger.warning(f"Parsing input schema: Definition '{ref_name}' not found")
                 return {"type": "string"}
         return s

936-943: Handle None URL early in validate_url.

urlparse(None) raises TypeError; return a friendly error instead.

-    async def validate_url(self, url: str | None) -> tuple[bool, str]:
+    async def validate_url(self, url: str | None) -> tuple[bool, str]:
         """Validate the SSE URL before attempting connection."""
-        try:
+        if not url:
+            return False, "URL is required."
+        try:
             parsed = urlparse(url)

979-987: Broaden redirect handling for SSE pre-check.

Also follow 301/302/308 to surface the final URL.

-                if response.status_code == httpx.codes.TEMPORARY_REDIRECT:
+                if response.status_code in (
+                    httpx.codes.MOVED_PERMANENTLY,      # 301
+                    httpx.codes.FOUND,                  # 302
+                    httpx.codes.TEMPORARY_REDIRECT,     # 307
+                    httpx.codes.PERMANENT_REDIRECT,     # 308
+                ):
                     return response.headers.get("Location", url)

152-159: Avoid run_until_complete in potentially running loops.

This can raise "This event loop is already running" in async environments. Use asyncio.run when no loop; otherwise offload to a worker thread.

-        try:
-            loop = asyncio.get_event_loop()
-            return loop.run_until_complete(client.run_tool(tool_name, arguments=validated.model_dump()))
+        try:
+            try:
+                loop = asyncio.get_running_loop()
+            except RuntimeError:
+                # No running loop
+                return asyncio.run(client.run_tool(tool_name, arguments=validated.model_dump()))
+            else:
+                # Running loop: execute in a separate thread with its own loop
+                import concurrent.futures
+                with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
+                    fut = ex.submit(asyncio.run, client.run_tool(tool_name, arguments=validated.model_dump()))
+                    return fut.result()

55-56: Deduplicate emoji range in regex.

The block U+2702–U+27B0 is listed twice; drop the duplicate.

-        "\U00002702-\U000027b0"
-        "\U00002702-\U000027b0"
+        "\U00002702-\U000027b0"

178-191: Sanitize the input name too for symmetry.

You sanitize stored flow names but not the input; sanitize both to avoid mismatches.

 async def get_flow_snake_case(flow_name: str, user_id: str, session, is_action: bool | None = None) -> Flow | None:
-    uuid_user_id = UUID(user_id) if isinstance(user_id, str) else user_id
+    uuid_user_id = UUID(user_id) if isinstance(user_id, str) else user_id
+    flow_name = sanitize_mcp_name(flow_name)

578-581: Fix shutdown log message.

"Message is shutting down" → "Session is shutting down".

-                            # Session is being shut down
-                            msg = "Message is shutting down"
+                            # Session is being shut down
+                            msg = "Session is shutting down"

Also applies to: 641-644


25-26: Remove unused constant.

HTTP_ERROR_STATUS_CODE is declared but unused.

-HTTP_ERROR_STATUS_CODE = httpx_codes.BAD_REQUEST  # HTTP status code for client errors

511-526: Consider relying solely on connectivity test; private _write_stream is brittle.

Accessing session._write_stream is an internal detail and may break across MCP/anyio versions. The connectivity probe already gates reuse.

Would you like me to gate reuse only on task.done() + connectivity check and drop the private stream inspection?

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 859f8da and 6ac8e01.

📒 Files selected for processing (1)
  • src/lfx/src/lfx/base/mcp/util.py (33 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/base/mcp/util.py (3)
src/backend/tests/conftest.py (1)
  • flow (544-560)
src/backend/base/langflow/services/database/models/flow/model.py (1)
  • Flow (186-212)
src/backend/base/langflow/services/deps.py (1)
  • get_settings_service (111-124)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Update Starter Projects
  • GitHub Check: Run Ruff Check and Format
  • GitHub Check: Ruff Style Check (3.13)
🔇 Additional comments (2)
src/lfx/src/lfx/base/mcp/util.py (2)

783-786: Validate call sites after async signature change — get_session is now async with (context_id, params, type).
Ensure all callers were updated (no remaining sync calls or old-signature usages).
Locations: src/lfx/src/lfx/base/mcp/util.py:783–786 and 1052–1055.
Verification attempt failed: ripgrep errored with "unrecognized file type: python". Re-run searches:

rg -nP '\.get_session\(' -C2
rg -nP 'validate_url\(' -C2
rg -nP 'pre_check_redirect\(' -C2
# fallback if rg type handling is broken:
grep -RIn --include='*.py' '\.get_session(' .

180-181: No action needed – Flow.is_component is present on the Flow model.
The is_component field is declared in src/backend/base/langflow/services/database/models/flow/model.py (line 42 on FlowBase and line 237 on Flow).

Comment on lines +224 to +249
if "anyOf" in s:
# Handle common pattern for nullable types (anyOf with string and null)
subtypes = [sub.get("type") for sub in s["anyOf"] if isinstance(sub, dict) and "type" in sub]

# Check if this is a simple nullable type (e.g., str | None)
if len(subtypes) == NULLABLE_TYPE_LENGTH and "null" in subtypes:
# Get the non-null type
non_null_type = next(t for t in subtypes if t != "null")
# Map it to Python type
if isinstance(non_null_type, str):
return {
"string": str,
"integer": int,
"number": float,
"boolean": bool,
"object": dict,
"array": list,
}.get(non_null_type, Any)
return Any

# For other anyOf cases, use the first non-null type
subtypes = [parse_type(sub) for sub in s["anyOf"]]
non_null_types = [t for t in subtypes if t is not None and t is not type(None)]
if non_null_types:
return non_null_types[0]
return str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Honor nullable fields even when required (JSON Schema anyOf with null).

Required-but-nullable properties (anyOf including "null") currently become non-nullable, causing validation failures for None. Make parse_type return T|None for simple nullable unions; this preserves nullability independent of "required".

-        if "anyOf" in s:
-            # Handle common pattern for nullable types (anyOf with string and null)
-            subtypes = [sub.get("type") for sub in s["anyOf"] if isinstance(sub, dict) and "type" in sub]
-
-            # Check if this is a simple nullable type (e.g., str | None)
-            if len(subtypes) == NULLABLE_TYPE_LENGTH and "null" in subtypes:
-                # Get the non-null type
-                non_null_type = next(t for t in subtypes if t != "null")
-                # Map it to Python type
-                if isinstance(non_null_type, str):
-                    return {
-                        "string": str,
-                        "integer": int,
-                        "number": float,
-                        "boolean": bool,
-                        "object": dict,
-                        "array": list,
-                    }.get(non_null_type, Any)
-                return Any
-
-            # For other anyOf cases, use the first non-null type
-            subtypes = [parse_type(sub) for sub in s["anyOf"]]
-            non_null_types = [t for t in subtypes if t is not None and t is not type(None)]
-            if non_null_types:
-                return non_null_types[0]
-            return str
+        if "anyOf" in s:
+            # Resolve refs first
+            variants = [resolve_ref(sub) for sub in s["anyOf"] if isinstance(sub, dict)]
+            # Simple nullable: exactly one "null" + one non-null schema
+            if len(variants) == NULLABLE_TYPE_LENGTH and any(v.get("type") == "null" for v in variants):
+                non_null_schema = next((v for v in variants if v.get("type") != "null"), {})
+                base = parse_type(non_null_schema)
+                return base | None
+            # Fallback: prefer first non-null parsed type
+            parsed = [parse_type(v) for v in variants]
+            for t in parsed:
+                if t is not None and t is not type(None):
+                    return t
+            return str

Also applies to: 320-328

@edwinjosechittilappilly edwinjosechittilappilly added chore Maintenance tasks and housekeeping community Pull Request from an external contributor labels Sep 17, 2025
@github-actions github-actions bot added refactor Maintenance tasks and housekeeping and removed refactor Maintenance tasks and housekeeping chore Maintenance tasks and housekeeping labels Sep 17, 2025
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Pull Request from an external contributor refactor Maintenance tasks and housekeeping
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants