
Conversation

@deon-sanchez (Collaborator) commented Dec 12, 2025

Summary by CodeRabbit

  • New Features

    • Added live model data integration from models.dev API with intelligent caching and refresh capabilities.
    • Expanded model information including costs, token limits, modalities, knowledge cutoff, and documentation links.
    • Enhanced provider detection with support for additional model providers and their icons.
  • Improvements

    • Improved modal scrolling for better navigation in provider selection.


@deon-sanchez self-assigned this Dec 12, 2025
coderabbitai bot commented Dec 12, 2025

Walkthrough

This PR adds live model data fetching from the models.dev API with caching support, controlled by the LFX_USE_LIVE_MODEL_DATA environment variable. It expands the model metadata type system with cost, limits, and modalities structures, introduces a new models_dev_client module with cache management and API integration, and updates frontend provider icon mappings. The system maintains backward-compatible static defaults with fallback behavior when live data is unavailable.

Changes

Cohort / File(s) Change Summary
Environment Configuration
.env.example
Added LFX_USE_LIVE_MODEL_DATA environment variable with documentation describing purpose (fetch model data from models.dev API when enabled), supported providers, fallback behavior, and default value; minor formatting alignment in existing comments.
Frontend Provider Mappings
src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts
Updated provider-to-icon mappings with renamed entries (e.g., "Google Generative AI" → "GoogleGenerativeAI"), new provider entries (e.g., "Google Vertex AI", "Ollama Cloud", "IBM Watsonx", "SambaNova", "Together AI", "Fireworks AI", "DeepSeek", "xAI", "Alibaba", "Cerebras", Azure/AWS variants), and preserved default "Bot" fallback.
Frontend Modal Styling
src/frontend/src/modals/modelProviderModal/index.tsx
Added overflow-y-auto to left panel container to enable vertical scrolling on content overflow.
Backend Model Metadata Types
src/lfx/src/lfx/base/models/model_metadata.py
Introduced three new TypedDicts (ModelCost, ModelLimits, ModelModalities) for structured pricing, token limits, and input/output modalities; extended ModelMetadata with 16 new optional fields (provider_id, display_name, structured_output, temperature, attachment, open_weights, cost, limits, modalities, knowledge_cutoff, release_date, last_updated, api_base, env_vars, documentation_url, model_type); updated create_model_metadata() to accept and conditionally include these fields.
Backend Models API Client (New)
src/lfx/src/lfx/base/models/models_dev_client.py
New module providing models.dev API integration with 30-second HTTP timeout, local disk caching (1-hour TTL at ~/.cache/langflow/.models_dev_cache.json), error fallback to stale cache, data transformation via transform_api_model_to_metadata(), and public APIs: fetch_models_dev_data(), get_live_models_detailed(), get_models_by_provider(), search_models(), get_provider_metadata_from_api(), clear_cache(). A minimal sketch of the cache pattern appears after this table.
Backend Model Package API
src/lfx/src/lfx/base/models/__init__.py
Expanded public exports to include model metadata types (ModelCost, ModelLimits, ModelMetadata, ModelModalities, create_model_metadata), live-model utilities (clear_cache, fetch_models_dev_data, get_live_models_detailed, get_models_by_provider, get_provider_metadata_from_api, search_models), and refresh_live_model_data; added grouping comments for organized API surface.
Backend Unified Models
src/lfx/src/lfx/base/models/unified_models.py
Introduced USE_LIVE_MODEL_DATA toggle (environment-driven), added get_static_model_provider_metadata(), get_live_models_as_groups(), and refresh_live_model_data() functions; updated get_models_detailed() to conditionally fetch live data from API with fallback to static defaults; imported cache/live-fetch utilities from models_dev_client.
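
As referenced in the models_dev_client row above, here is a minimal sketch of the disk-cache pattern, assuming the cache path, 1-hour TTL, and a timestamp-plus-data payload shape from the summary; the module's actual internals may differ.

import json
import time
from pathlib import Path
from typing import Any

CACHE_PATH = Path.home() / ".cache" / "langflow" / ".models_dev_cache.json"
CACHE_TTL_SECONDS = 3600  # 1-hour TTL per the PR description


def load_cache() -> dict[str, Any] | None:
    """Return cached data if the cache file exists and is fresh, else None."""
    if not CACHE_PATH.exists():
        return None
    try:
        payload = json.loads(CACHE_PATH.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    if time.time() - payload.get("timestamp", 0) > CACHE_TTL_SECONDS:
        return None  # expired; caller should re-fetch from the API
    return payload.get("data")


def save_cache(data: dict[str, Any]) -> None:
    """Write data plus a timestamp, creating the cache directory if needed."""
    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    CACHE_PATH.write_text(json.dumps({"timestamp": time.time(), "data": data}))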

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant UM as unified_models
    participant MDC as models_dev_client
    participant Cache as Local Cache File
    participant API as models.dev API
    
    Client->>UM: get_models_detailed()
    alt USE_LIVE_MODEL_DATA enabled
        UM->>MDC: get_live_models_detailed()
        MDC->>Cache: _load_cache()
        alt Cache valid (within 1h TTL)
            Cache-->>MDC: cached data
        else Cache missing or expired
            MDC->>API: fetch https://models.dev/api.json
            API-->>MDC: provider & model data
            MDC->>Cache: _save_cache(data)
        end
        loop For each provider & model
            MDC->>MDC: transform_api_model_to_metadata()
            Note over MDC: Map provider IDs,<br/>determine model type,<br/>structure cost/limits
        end
        MDC-->>UM: list[ModelMetadata]
    else Live data unavailable
        UM->>UM: get_static_models_detailed()
        Note over UM: Fallback to static<br/>model definitions
    end
    UM-->>Client: list[ModelMetadata]
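In code form, the diagram's resolution flow reduces to roughly the following sketch. Function names follow the PR summary; the stub bodies, signatures, and the sample static entry are assumptions.

import os

USE_LIVE_MODEL_DATA = os.getenv("LFX_USE_LIVE_MODEL_DATA", "false").lower() == "true"


def get_live_models_detailed() -> list[dict]:
    """Stub standing in for the cache-backed models.dev fetch."""
    return []  # pretend live data is unavailable


def get_static_models_detailed() -> list[dict]:
    """Stub standing in for the static model definitions."""
    return [{"provider": "OpenAI", "name": "gpt-4o"}]  # illustrative entry


def get_models_detailed() -> list[dict]:
    if USE_LIVE_MODEL_DATA:
        live = get_live_models_detailed()
        if live:
            return live
        # Live data unavailable: fall back to the static defaults
    return get_static_models_detailed()


print(get_models_detailed())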

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • models_dev_client.py: New 250+ line module with API integration, caching logic, data transformation, and multiple public/internal functions—requires verification of error handling, cache TTL logic, API contract mapping, and provider/model type determination.
  • model_metadata.py: Introduces three new TypedDicts and significantly extends ModelMetadata structure with 16 optional fields—requires careful review of type annotations, field initialization logic, and backward compatibility.
  • unified_models.py: Adds conditional live/static resolution with caching imports and new public functions—requires verification of environment variable handling, fallback paths, and interaction with models_dev_client.
  • Frontend mappings (use-get-model-providers.ts): Provider name/icon mappings are numerous and should be cross-checked for accuracy and consistency with backend provider IDs.

Suggested labels

enhancement, size:L

Suggested reviewers

  • jordanrfrazier
  • edwinjosechittilappilly
  • lucaseduoli

Pre-merge checks and finishing touches

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning, 1 inconclusive)
  • Test Coverage For New Implementations — ❌ Error: the PR introduces significant new backend functionality and frontend changes without corresponding test coverage for API caching, error handling, live-data resolution, and UI updates. Resolution: add test files for models_dev_client.py and model_metadata.py, update the unified_models.py tests, and add frontend tests for the provider icons and scrolling behavior (a pytest sketch follows this list).
  • Test Quality And Coverage — ⚠️ Warning: new backend functionality in models_dev_client.py and unified_models.py lacks test coverage despite the project's extensive testing infrastructure. Resolution: create comprehensive pytest tests for models_dev_client.py, model_metadata.py, and unified_models.py covering API interactions, caching, and error handling, plus frontend TypeScript tests for use-get-model-providers.ts.
  • Test File Naming And Structure — ❓ Inconclusive: the PR adds significant backend functionality (models_dev_client.py with caching and API logic) and frontend changes, but no test files appear in the modified-files list. Resolution: verify whether tests exist elsewhere, confirm the project's testing conventions and directory structure, and determine whether new tests should accompany the backend changes.
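
As a concrete starting point for the failed coverage check, a hedged pytest sketch along these lines could exercise the cache round-trip and the HTTP-error path. Import paths follow the PR's layout; the private helpers (_get_cache_path, _save_cache, _load_cache) are assumed from the review excerpts further down and may need adjusting to the real module.

from unittest import mock

import httpx

from lfx.base.models import models_dev_client


def test_cache_round_trip(monkeypatch, tmp_path):
    """_save_cache followed by _load_cache returns the saved payload."""
    monkeypatch.setattr(models_dev_client, "_get_cache_path", lambda: tmp_path / "cache.json")
    data = {"openai": {"name": "OpenAI"}}
    models_dev_client._save_cache(data)
    assert models_dev_client._load_cache() == data


def test_fetch_returns_empty_dict_on_http_error(monkeypatch, tmp_path):
    """With no cache on disk and a failing API, fetch should return {}."""
    monkeypatch.setattr(models_dev_client, "_get_cache_path", lambda: tmp_path / "missing.json")
    with mock.patch.object(httpx.Client, "get", side_effect=httpx.HTTPError("boom")):
        assert models_dev_client.fetch_models_dev_data(force_refresh=True) == {}
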
✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'feat: Get live provider and model data from model.dev' accurately describes the main change, integrating live data fetching from the models.dev API for providers and models throughout the codebase.
  • Docstring Coverage — ✅ Passed: docstring coverage is 95.65%, above the required threshold of 80.00%.
  • Excessive Mock Usage Warning — ✅ Passed: the custom check is not applicable because the PR introduces new production code modules but no test files containing mock objects.


@github-actions bot commented

Frontend Unit Test Coverage Report

Coverage Summary

Lines: 16% · Statements: 16.42% (4606/28041) · Branches: 9.73% (2106/21644) · Functions: 10.76% (664/6166)

Unit Test Results

Tests: 1803 · Skipped: 0 💤 · Failures: 0 ❌ · Errors: 0 🔥 · Time: 24.646s ⏱️

@github-actions bot added the enhancement (New feature or request) label Dec 12, 2025
codecov bot commented Dec 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 33.02%. Comparing base (f5e68c2) to head (841b365).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main   #11007      +/-   ##
==========================================
- Coverage   33.02%   33.02%   -0.01%     
==========================================
  Files        1388     1388              
  Lines       65544    65538       -6     
  Branches     9680     9680              
==========================================
- Hits        21648    21642       -6     
  Misses      42798    42798              
  Partials     1098     1098              
Flag Coverage Δ
frontend 15.16% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...lers/API/queries/models/use-get-model-providers.ts 84.61% <ø> (ø)
...c/frontend/src/modals/modelProviderModal/index.tsx 0.00% <ø> (ø)
src/lfx/src/lfx/base/models/model_metadata.py 100.00% <ø> (ø)
src/lfx/src/lfx/base/models/unified_models.py 9.04% <ø> (-0.44%) ⬇️

@coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
src/lfx/src/lfx/base/models/unified_models.py (2)

100-113: Consider narrowing the exception type (static analysis warning).

The blind Exception catch at line 109 triggers Ruff BLE001. While the fallback behavior is correct for resilience, consider either:

  1. Catching more specific exceptions (e.g., OSError, ValueError, TimeoutError)
  2. Adding # noqa: BLE001 with a comment explaining why broad catching is intentional here
-    except Exception as e:
+    except Exception as e:  # noqa: BLE001 - Intentional broad catch for graceful fallback
         logger.debug(f"Failed to get live provider metadata: {e}")

133-155: Same BLE001 warning applies here.

Line 153 has the same blind exception catch. Add # noqa: BLE001 with justification for consistency.

-    except Exception as e:
+    except Exception as e:  # noqa: BLE001 - Intentional broad catch for graceful fallback
         logger.debug(f"Failed to fetch live models: {e}")
         return []
src/lfx/src/lfx/base/models/models_dev_client.py (2)

174-189: Use idiomatic string containment check.

Line 186 uses .find() != -1 which is not idiomatic Python. Use the in operator instead.

Apply this diff:

-    if model_data.get("id", "").lower().find("embed") != -1:
+    if "embed" in model_data.get("id", "").lower():

191-227: Consider omitting None values from TypedDict construction.

The transform functions explicitly include optional fields with None values (e.g., lines 199-203). Since these TypedDicts have total=False, it's cleaner to omit None values rather than include them, which also prevents potential serialization issues.

Example for _transform_cost:

 def _transform_cost(cost_data: dict[str, Any] | None) -> ModelCost | None:
     """Transform API cost data to ModelCost format."""
     if not cost_data:
         return None
 
-    return ModelCost(
-        input=cost_data.get("input", 0),
-        output=cost_data.get("output", 0),
-        reasoning=cost_data.get("reasoning"),
-        cache_read=cost_data.get("cache_read"),
-        cache_write=cost_data.get("cache_write"),
-        input_audio=cost_data.get("input_audio"),
-        output_audio=cost_data.get("output_audio"),
-    )
+    result = ModelCost(
+        input=cost_data.get("input", 0),
+        output=cost_data.get("output", 0),
+    )
+    # Only add optional fields if present
+    for key in ["reasoning", "cache_read", "cache_write", "input_audio", "output_audio"]:
+        if key in cost_data:
+            result[key] = cost_data[key]
+    return result

Apply similar logic to _transform_limits and _transform_modalities.
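
For illustration, the same omit-None treatment applied to _transform_limits might look like the sketch below; the "context" and "output" field names and the import path are assumptions based on the models.dev payload shape and the PR layout.

from typing import Any

from lfx.base.models.model_metadata import ModelLimits


def _transform_limits(limit_data: dict[str, Any] | None) -> ModelLimits | None:
    """Transform API limit data to ModelLimits, omitting absent fields."""
    if not limit_data:
        return None

    result = ModelLimits()
    # Only copy fields the API actually provided; assumed keys shown here.
    for key in ("context", "output"):
        if key in limit_data:
            result[key] = limit_data[key]
    return result or None  # an all-empty dict means "no limit data"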

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a54e508 and 841b365.

📒 Files selected for processing (7)
  • .env.example (2 hunks)
  • src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts (1 hunks)
  • src/frontend/src/modals/modelProviderModal/index.tsx (1 hunks)
  • src/lfx/src/lfx/base/models/__init__.py (1 hunks)
  • src/lfx/src/lfx/base/models/model_metadata.py (1 hunks)
  • src/lfx/src/lfx/base/models/models_dev_client.py (1 hunks)
  • src/lfx/src/lfx/base/models/unified_models.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
src/frontend/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

src/frontend/src/**/*.{ts,tsx}: Use React 18 with TypeScript for frontend development
Use Zustand for state management

Files:

  • src/frontend/src/modals/modelProviderModal/index.tsx
  • src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts
src/frontend/src/**/*.{tsx,jsx,css,scss}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

Use Tailwind CSS for styling

Files:

  • src/frontend/src/modals/modelProviderModal/index.tsx
src/frontend/src/**/*.{tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

src/frontend/src/**/*.{tsx,jsx}: Implement dark mode support using the useDarkMode hook and dark store
Use Lucide React for icon components in the application

Files:

  • src/frontend/src/modals/modelProviderModal/index.tsx
🧠 Learnings (5)
📚 Learning: 2025-07-11T22:12:46.255Z
Learnt from: namastex888
Repo: langflow-ai/langflow PR: 9018
File: src/frontend/src/modals/apiModal/codeTabs/code-tabs.tsx:244-244
Timestamp: 2025-07-11T22:12:46.255Z
Learning: In src/frontend/src/modals/apiModal/codeTabs/code-tabs.tsx, the inconsistent showLineNumbers setting between Step 1 (false) and Step 2 (true) in the API modal is intentional to prevent breaking the modal height. Step 1 uses showLineNumbers={false} to save vertical space while Step 2 uses showLineNumbers={true} for better readability of longer code.

Applied to files:

  • src/frontend/src/modals/modelProviderModal/index.tsx
📚 Learning: 2025-07-23T21:19:22.567Z
Learnt from: deon-sanchez
Repo: langflow-ai/langflow PR: 9158
File: src/backend/base/langflow/api/v1/mcp_projects.py:404-404
Timestamp: 2025-07-23T21:19:22.567Z
Learning: In langflow MCP projects configuration, prefer using dynamically computed URLs (like the `sse_url` variable) over hardcoded localhost URLs to ensure compatibility across different deployment environments.

Applied to files:

  • .env.example
📚 Learning: 2025-11-24T19:46:57.920Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-11-24T19:46:57.920Z
Learning: Applies to src/frontend/src/icons/lazyIconImports.ts : Add icon entries to the `lazyIconsMapping` object in `src/frontend/src/icons/lazyIconImports.ts`. The key must match the backend icon name exactly (case-sensitive) and use dynamic imports: `IconName: () => import("@/icons/IconName").then((mod) => ({ default: mod.IconNameIcon }))`.

Applied to files:

  • src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts
📚 Learning: 2025-11-24T19:46:57.920Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-11-24T19:46:57.920Z
Learning: Use clear, recognizable icon names (e.g., `"AstraDB"`, `"Postgres"`, `"OpenAI"`). Always use the same icon name for the same service across backend and frontend.

Applied to files:

  • src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts
📚 Learning: 2025-06-23T12:46:52.420Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-06-23T12:46:52.420Z
Learning: The frontend icon mapping key (in 'lazyIconsMapping') must match the backend 'icon' attribute string exactly, including case sensitivity, to ensure correct icon rendering.

Applied to files:

  • src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts
🧬 Code graph analysis (2)
src/lfx/src/lfx/base/models/models_dev_client.py (1)
src/lfx/src/lfx/base/models/model_metadata.py (4)
  • ModelCost (4-13)
  • ModelLimits (16-20)
  • ModelMetadata (30-69)
  • ModelModalities (23-27)
src/lfx/src/lfx/base/models/unified_models.py (1)
src/lfx/src/lfx/base/models/models_dev_client.py (3)
  • clear_cache (386-394)
  • get_live_models_detailed (307-350)
  • get_provider_metadata_from_api (354-383)
🪛 GitHub Actions: Ruff Style Check
src/lfx/src/lfx/base/models/__init__.py

[error] 20-20: RUF022 __all__ is not sorted.

🪛 GitHub Check: Ruff Style Check (3.13)
src/lfx/src/lfx/base/models/models_dev_client.py

[failure] 169-169: Ruff (BLE001)
src/lfx/src/lfx/base/models/models_dev_client.py:169:12: BLE001 Do not catch blind exception: Exception


[failure] 159-159: Ruff (TRY300)
src/lfx/src/lfx/base/models/models_dev_client.py:159:9: TRY300 Consider moving this statement to an else block

src/lfx/src/lfx/base/models/unified_models.py

[failure] 109-109: Ruff (BLE001)
src/lfx/src/lfx/base/models/unified_models.py:109:12: BLE001 Do not catch blind exception: Exception


[failure] 189-189: Ruff (PLW0602)
src/lfx/src/lfx/base/models/unified_models.py:189:12: PLW0602 Using global for MODELS_DETAILED but no assignment is done


[failure] 153-153: Ruff (BLE001)
src/lfx/src/lfx/base/models/unified_models.py:153:12: BLE001 Do not catch blind exception: Exception

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Test Docker Images / Test docker images
  • GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
  • GitHub Check: Run Frontend Unit Tests / Frontend Jest Unit Tests
  • GitHub Check: Test Starter Templates
  • GitHub Check: Update Component Index
  • GitHub Check: Run Ruff Check and Format
  • GitHub Check: Update Starter Projects
🔇 Additional comments (13)
src/frontend/src/modals/modelProviderModal/index.tsx (1)

255-255: LGTM! Enables scrolling for overflowing provider list.

The addition of overflow-y-auto correctly handles scenarios where the provider list exceeds the fixed height of 513px, allowing users to scroll through all available providers.

.env.example (1)

143-148: Documentation looks good.

The environment variable documentation is clear and follows the established pattern in this file. It properly documents the purpose, allowed values, default, and fallback behavior.

src/lfx/src/lfx/base/models/model_metadata.py (2)

4-28: Well-structured type definitions.

The new ModelCost, ModelLimits, and ModelModalities TypedDicts are well-designed with total=False for optional fields and clear inline documentation of units and purposes.


72-144: Clean implementation of the factory function.

The conditional assignment pattern (only adding fields when not None) is a good approach to keep the metadata dictionaries lean. The function signature is getting long but remains manageable for a configuration factory.

src/lfx/src/lfx/base/models/unified_models.py (3)

33-35: Environment variable parsing looks good.

The pattern of parsing the env var with a lowercase comparison and default to "false" is correct and defensive.
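
For reference, the pattern in question reduces to a single expression; the variable and env-var names come from the PR, though the exact line in unified_models.py may differ slightly.

import os

# Defaults to "false", compares lowercase, so "True"/"TRUE"/"true" all enable it.
USE_LIVE_MODEL_DATA = os.getenv("LFX_USE_LIVE_MODEL_DATA", "false").lower() == "true"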


158-175: Good fallback pattern with appropriate logging.

The live-to-static fallback with a warning log is a solid resilience pattern. The function cleanly encapsulates the toggle behavior.


181-200: Likely an incorrect or invalid review comment.

src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts (1)

78-100: Verify icon keys match lazyIconImports.ts before merging.

The provider-to-icon mappings reference several custom icons (GoogleGenerativeAI, VertexAI, Mistral, DeepSeek, xAI, WatsonxAI, SambaNova, AWS, Azure, NVIDIA, Ollama) that must be registered in src/frontend/src/icons/lazyIconImports.ts. Icon keys are case-sensitive and must match exactly. Ensure each referenced icon key has a corresponding dynamic import entry in lazyIconImports.ts. The "Bot" fallback is acceptable for unsupported providers (Together AI, Fireworks AI, Alibaba, Cerebras).
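
One way to automate that verification is a rough cross-check script like the one below. The file paths come from this PR, but the regexes are heuristics for the assumed file shapes and will need tuning if the real syntax differs.

import re
from pathlib import Path

providers = Path("src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts").read_text()
icons = Path("src/frontend/src/icons/lazyIconImports.ts").read_text()

# Assumed shape in the provider map: `"Provider Name": "IconName",`
referenced = set(re.findall(r':\s*"(\w+)"', providers))
# Assumed shape in lazyIconsMapping: `IconName: () => import(...)`
registered = set(re.findall(r"^\s*(\w+):\s*\(\)\s*=>\s*import", icons, re.MULTILINE))

missing = referenced - registered - {"Bot"}  # "Bot" is the Lucide fallback
print("Unregistered icon keys:", sorted(missing) or "none")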

src/lfx/src/lfx/base/models/models_dev_client.py (5)

1-93: LGTM: Well-structured module with comprehensive provider mappings.

The imports, constants, and provider mappings are well-organized and comprehensive.


96-120: LGTM: Cache loading logic is sound.

The cache path construction and loading with TTL validation are implemented correctly.


122-131: LGTM: Cache saving is implemented correctly.

Directory creation and error handling are appropriate.


307-351: LGTM: Query function is well-implemented.

The filtering logic and parameter handling are correct.


397-439: LGTM: Convenience and search functions are well-implemented.

The wrapper and search functions provide a clean API surface.

Comment on lines 20 to 41
__all__ = [
    # Core components
    "LCModelComponent",
    # Unified models API
    "get_model_provider_variable_mapping",
    "get_model_providers",
    "get_unified_models_detailed",
    "refresh_live_model_data",
    # Model metadata types
    "ModelCost",
    "ModelLimits",
    "ModelMetadata",
    "ModelModalities",
    "create_model_metadata",
    # Live models API (models.dev)
    "clear_live_models_cache",
    "fetch_models_dev_data",
    "get_live_models_detailed",
    "get_models_by_provider",
    "get_provider_metadata_from_api",
    "search_models",
]

⚠️ Potential issue | 🟡 Minor

Fix: __all__ is not sorted (pipeline failure).

The static analysis is failing because __all__ entries are not sorted alphabetically. While the grouping comments are helpful for organization, Ruff's RUF022 rule requires sorted exports.

Apply this diff to fix the pipeline failure:

 __all__ = [
-    # Core components
     "LCModelComponent",
-    # Unified models API
-    "get_model_provider_variable_mapping",
-    "get_model_providers",
-    "get_unified_models_detailed",
-    "refresh_live_model_data",
-    # Model metadata types
     "ModelCost",
     "ModelLimits",
     "ModelMetadata",
     "ModelModalities",
+    "clear_live_models_cache",
     "create_model_metadata",
-    # Live models API (models.dev)
-    "clear_live_models_cache",
     "fetch_models_dev_data",
     "get_live_models_detailed",
+    "get_model_provider_variable_mapping",
+    "get_model_providers",
     "get_models_by_provider",
     "get_provider_metadata_from_api",
+    "get_unified_models_detailed",
+    "refresh_live_model_data",
     "search_models",
 ]
🧰 Tools
🪛 GitHub Actions: Ruff Style Check

[error] 20-20: RUF022 __all__ is not sorted.

🪛 GitHub Check: Ruff Style Check (3.13)

[failure] 20-41: Ruff (RUF022)
src/lfx/src/lfx/base/models/__init__.py:20:11: RUF022 __all__ is not sorted

🤖 Prompt for AI Agents
In src/lfx/src/lfx/base/models/__init__.py around lines 20 to 41, the __all__
list is failing RUF022 because its string entries are not alphabetically sorted;
update the list so all exported names (ignore the inline grouping comments) are
sorted in ascending alphabetical order, keeping each entry as a quoted string
with trailing commas and preserving overall formatting/line breaks.

Comment on lines +133 to +172
def fetch_models_dev_data(*, force_refresh: bool = False) -> dict[str, Any]:
    """Fetch model data from models.dev API.

    Args:
        force_refresh: If True, bypass cache and fetch fresh data.

    Returns:
        Dictionary containing all provider and model data from the API.
    """
    # Try cache first
    if not force_refresh:
        cached = _load_cache()
        if cached is not None:
            logger.debug("Using cached models.dev data")
            return cached

    # Fetch from API
    try:
        with httpx.Client(timeout=30.0) as client:
            response = client.get(MODELS_DEV_API_URL)
            response.raise_for_status()
            data = response.json()

        # Cache the result
        _save_cache(data)
        logger.info("Successfully fetched models.dev data")
        return data

    except httpx.HTTPError as e:
        logger.warning(f"Failed to fetch models.dev data: {e}")
        # Try to return stale cache if available
        cached = _load_cache()
        if cached is not None:
            logger.info("Using stale cache due to API error")
            return cached
        return {}
    except Exception as e:
        logger.error(f"Unexpected error fetching models.dev data: {e}")
        return {}


⚠️ Potential issue | 🔴 Critical

Fix stale cache fallback logic and exception handling.

There are several issues in this function:

  1. Line 164: The stale cache fallback won't work as intended because _load_cache() checks TTL and returns None for expired cache. To retrieve stale cache, you need to read the file directly without TTL validation.

  2. Line 169: Catching bare Exception is too broad and can mask unexpected errors. Be more specific or remove this catch block to let unexpected exceptions propagate.

  3. Line 151: The timeout value 30.0 should be defined as a module-level constant alongside other configuration values.

Apply this diff to fix the stale cache logic:

     except httpx.HTTPError as e:
         logger.warning(f"Failed to fetch models.dev data: {e}")
-        # Try to return stale cache if available
-        cached = _load_cache()
-        if cached is not None:
+        # Try to return stale cache (bypass TTL check)
+        cache_path = _get_cache_path()
+        if cache_path.exists():
+            try:
+                with cache_path.open() as f:
+                    cache_data = json.load(f)
+                    cached = cache_data.get("data")
+                if cached:
+                    logger.info("Using stale cache due to API error")
+                    return cached
+            except (OSError, json.JSONDecodeError):
+                pass
-            logger.info("Using stale cache due to API error")
-            return cached
         return {}
-    except Exception as e:
-        logger.error(f"Unexpected error fetching models.dev data: {e}")
-        return {}

For the timeout constant:

 # API Configuration
 MODELS_DEV_API_URL = "https://models.dev/api.json"
 CACHE_TTL_SECONDS = 3600  # 1 hour cache TTL
+HTTP_TIMEOUT_SECONDS = 30.0  # HTTP request timeout
 CACHE_FILE_NAME = ".models_dev_cache.json"
-        with httpx.Client(timeout=30.0) as client:
+        with httpx.Client(timeout=HTTP_TIMEOUT_SECONDS) as client:

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 GitHub Check: Ruff Style Check (3.13)

[failure] 169-169: Ruff (BLE001)
src/lfx/src/lfx/base/models/models_dev_client.py:169:12: BLE001 Do not catch blind exception: Exception


[failure] 159-159: Ruff (TRY300)
src/lfx/src/lfx/base/models/models_dev_client.py:159:9: TRY300 Consider moving this statement to an else block

Comment on lines +229 to +304
def transform_api_model_to_metadata(
    provider_id: str,
    provider_data: dict[str, Any],
    model_id: str,
    model_data: dict[str, Any],
) -> ModelMetadata:
    """Transform API model data to ModelMetadata format.

    Args:
        provider_id: The provider ID from the API (e.g., "openai")
        provider_data: The provider data from the API
        model_id: The model ID from the API
        model_data: The model data from the API

    Returns:
        ModelMetadata object with transformed data
    """
    provider_name = PROVIDER_NAME_MAP.get(provider_id, provider_data.get("name", provider_id.title()))
    icon = PROVIDER_ICON_MAP.get(provider_id, "bot")  # Default to "bot" if no custom icon

    # Determine model type
    model_type = _determine_model_type(model_data)

    # Build metadata
    metadata = ModelMetadata(
        # Core identification
        provider=provider_name,
        provider_id=provider_id,
        name=model_id,
        display_name=model_data.get("name", model_id),
        icon=icon,
        # Capabilities
        tool_calling=model_data.get("tool_call", False),
        reasoning=model_data.get("reasoning", False),
        structured_output=model_data.get("structured_output", False),
        temperature=model_data.get("temperature", True),
        attachment=model_data.get("attachment", False),
        # Status flags
        preview="-preview" in model_id.lower() or "beta" in model_id.lower(),
        not_supported=provider_id not in SUPPORTED_PROVIDERS,
        deprecated=False,
        default=False,
        open_weights=model_data.get("open_weights", False),
        # Model classification
        model_type=model_type,
    )

    # Add extended metadata
    cost = _transform_cost(model_data.get("cost"))
    if cost:
        metadata["cost"] = cost

    limits = _transform_limits(model_data.get("limit"))
    if limits:
        metadata["limits"] = limits

    modalities = _transform_modalities(model_data.get("modalities"))
    if modalities:
        metadata["modalities"] = modalities

    if model_data.get("knowledge"):
        metadata["knowledge_cutoff"] = model_data["knowledge"]
    if model_data.get("release_date"):
        metadata["release_date"] = model_data["release_date"]
    if model_data.get("last_updated"):
        metadata["last_updated"] = model_data["last_updated"]

    # Provider metadata
    if provider_data.get("api"):
        metadata["api_base"] = provider_data["api"]
    if provider_data.get("env"):
        metadata["env_vars"] = provider_data["env"]
    if provider_data.get("doc"):
        metadata["documentation_url"] = provider_data["doc"]

    return metadata

⚠️ Potential issue | 🟡 Minor

Fix icon name inconsistency.

Line 247 uses lowercase "bot" as the default icon, but line 370 uses "Bot" with a capital B, and the PROVIDER_ICON_MAP consistently uses "Bot". This inconsistency could cause icon lookup failures in the frontend.

Apply this diff:

-    icon = PROVIDER_ICON_MAP.get(provider_id, "bot")  # Default to "bot" if no custom icon
+    icon = PROVIDER_ICON_MAP.get(provider_id, "Bot")  # Default to "Bot" if no custom icon
🤖 Prompt for AI Agents
In src/lfx/src/lfx/base/models/models_dev_client.py around lines 229 to 304, the
default icon is set to the lowercase string "bot" which is inconsistent with
PROVIDER_ICON_MAP and elsewhere using "Bot"; change the default icon value to
"Bot" (capital B) so the fallback matches the map and frontend lookups—update
the icon assignment to use "Bot" instead of "bot".

Comment on lines +353 to +384
@lru_cache(maxsize=1)
def get_provider_metadata_from_api() -> dict[str, dict[str, Any]]:
    """Get provider metadata from the API for all supported providers.

    Returns:
        Dictionary mapping provider names to their metadata
    """
    api_data = fetch_models_dev_data()
    if not api_data:
        return {}

    provider_metadata = {}
    for provider_id, provider_data in api_data.items():
        if provider_id not in SUPPORTED_PROVIDERS:
            continue

        provider_name = PROVIDER_NAME_MAP.get(provider_id, provider_data.get("name", provider_id.title()))
        icon = PROVIDER_ICON_MAP.get(provider_id, "Bot")

        env_vars = provider_data.get("env", [])
        variable_name = env_vars[0] if env_vars else f"{provider_id.upper()}_API_KEY"

        provider_metadata[provider_name] = {
            "icon": icon,
            "variable_name": variable_name,
            "api_base": provider_data.get("api"),
            "documentation_url": provider_data.get("doc"),
            "provider_id": provider_id,
        }

    return provider_metadata


⚠️ Potential issue | 🟠 Major

Address cache coherence issue with lru_cache.

The @lru_cache decorator creates a memory cache that won't be invalidated when clear_cache() is called. This could lead to stale provider metadata being returned from memory even after the disk cache is cleared.

Apply this diff to clear the lru_cache when clearing disk cache:

 def clear_cache() -> None:
     """Clear the models.dev cache."""
     cache_path = _get_cache_path()
     try:
         if cache_path.exists():
             cache_path.unlink()
+            get_provider_metadata_from_api.cache_clear()
             logger.info("Cleared models.dev cache")
     except OSError as e:
         logger.warning(f"Failed to clear cache: {e}")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/lfx/src/lfx/base/models/models_dev_client.py around lines 353-384, the
get_provider_metadata_from_api function is decorated with @lru_cache which
causes stale in-memory data after clearing the disk cache; update the code that
clears the disk cache (where clear_cache() or similar is implemented) to also
call get_provider_metadata_from_api.cache_clear() so the lru_cache is
invalidated when disk cache is cleared; if clear_cache() lives in another
module, import get_provider_metadata_from_api there and invoke .cache_clear() as
part of the cache-clearing routine.
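
A self-contained demonstration of the coherence problem, using illustrative names rather than the PR's: an lru_cache-wrapped reader keeps serving its memoized result even after the backing file is deleted, until cache_clear() is invoked.

import json
import tempfile
from functools import lru_cache
from pathlib import Path

cache_file = Path(tempfile.gettempdir()) / "demo_models_cache.json"
cache_file.write_text(json.dumps({"openai": {"name": "OpenAI"}}))


@lru_cache(maxsize=1)
def read_providers() -> str:
    return cache_file.read_text()


first = read_providers()   # reads the file and memoizes the result
cache_file.unlink()        # the disk-level "clear_cache()" removes the file...
second = read_providers()  # ...yet the memoized value is still returned
assert first == second
read_providers.cache_clear()  # only after this would the next call re-read disk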
